Our client is a technology start-up founded by UPMC, one of the largest healthcare providers in the US (ranked in the top 20 nationally), to examine de-identified protected health information (PHI) for analytical purposes. They provide services dedicated to removing PHI from different medical data sources in compliance with HIPAA rules.
Industry: Medical Documents Recognition
Team size: 9 members
Duration: 6 months
Status: Completed
As a startup, our client lacked the strong in-house technical and engineering expertise needed to turn their ideas into reality. The client aimed to build the first version of the framework as early as possible, as they intended to enter a niche market with their unique product.
There were no ready-made solutions that allowed for processing multiple types of data using modern technologies to de-identify patients’ PHI data, so the solution had to be built from scratch.
At the beginning of the project, our client had a basic idea of the results they expected and of the general business requirements, but they had no technical description of the project and no codebase.
The main challenge was to be first to market with a solution performing at a commercial level. This involved creating a technical plan, a solution architecture design, and a roadmap for developing the solution under tight deadlines.
They needed a team of experienced developers able to create a competitive product in full compliance with business needs. The main requirements for development were:
It was necessary to support the product delivery pipelines via CI/CD, including building new framework versions, deploying environments, and running automated tests.
The Akvelon team started with business requirement analysis. Using the results, the team composed a roadmap with all the features needed to reach the MVP. Together with the client, we wrote epics and split them into user stories for further planning and estimation.
The Akvelon team was responsible for the delivery process, with a manager from our side driving development, including all Scrum and Agile ceremonies.
As our team was tasked with building a SaaS product, AWS was chosen as the core platform for the solution because it met the security, scalability, and serverless requirements. Spark was chosen as the main data processing engine. To find PHI data, the team used the spaCy NLP library, which provides pre-trained ML models and supports retraining them and adding new ones.
Several pipelines were developed in AWS Batch for processing the various data types.
We built a separate NLP module based on spaCy, combining standard pre-trained models with custom ML models trained on a Harvard medical data set with open medical data, to identify PHI data in text. This module was used for processing all data types.
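The redaction step that follows entity detection can be illustrated with a minimal sketch. It is not the project's actual code: it assumes detected entities arrive as character spans `(start, end, label)`, which is the shape an NER model such as spaCy exposes for its entities, and it uses two simple regular expressions as a stand-in for the trained models. The names `PATTERNS`, `find_pattern_spans`, and `redact` are hypothetical.

```python
import re
from typing import List, Tuple

# A detected entity: (start_char, end_char, label).
Span = Tuple[int, int, str]

# Hypothetical stand-in patterns. A real pipeline would take most spans
# from trained NER models rather than from regular expressions.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_pattern_spans(text: str) -> List[Span]:
    """Return spans for every pattern match in the text."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return spans


def redact(text: str, spans: List[Span]) -> str:
    """Replace each span with a [LABEL] placeholder.

    Assumes the spans do not overlap. Spans are applied from the end of
    the text backwards so that earlier offsets remain valid.
    """
    result = text
    for start, end, label in sorted(spans, reverse=True):
        result = result[:start] + f"[{label}]" + result[end:]
    return result
```

For example, given the text `Patient John Smith, SSN 123-45-6789, phone 412-555-0100.` and an extra span `(8, 18, "NAME")` standing in for a model's output, `redact` returns `Patient [NAME], SSN [SSN], phone [PHONE].` Keeping detection and replacement separate lets the same redaction code serve every data type, whichever detector produced the spans.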
To make the system easy to use for data analysts and honest brokers, the Akvelon team created a desktop application (Windows and macOS) for running data processing without accessing the AWS Console.
To make it easy to deploy, destroy, and modify the AWS infrastructure, the team developed a CLI designed and structured the same way as the AWS CLI.
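As an illustration of that structure, here is a minimal sketch of a command-line parser that follows the AWS CLI shape of `<program> [global options] <command> <subcommand> [options]`. The program name `deid`, the `infra` command, and the `--region` and `--environment` options are all hypothetical; the source only states that the tool deploys, destroys, and modifies AWS infrastructure.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a two-level parser: global options, command, subcommand."""
    parser = argparse.ArgumentParser(prog="deid")  # hypothetical name
    # Global option placed before the command, as in the AWS CLI.
    parser.add_argument("--region", default="us-east-1")

    commands = parser.add_subparsers(dest="command", required=True)
    infra = commands.add_parser("infra", help="manage AWS infrastructure")

    actions = infra.add_subparsers(dest="action", required=True)
    for name in ("deploy", "destroy", "modify"):
        action = actions.add_parser(name)
        action.add_argument("--environment", required=True)
    return parser


def describe(argv: List[str]) -> str:
    """Parse the arguments and summarize the requested operation."""
    args = build_parser().parse_args(argv)
    return (
        f"{args.command} {args.action} "
        f"environment={args.environment} region={args.region}"
    )
```

Calling `describe(["--region", "us-west-2", "infra", "deploy", "--environment", "staging"])` returns `infra deploy environment=staging region=us-west-2`. Nested subparsers keep each operation's options separate, so new commands can be added without changing existing ones.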
Licensing services built by our team prevent users without the required permission or license from accessing the system.
Within 6 months, the Akvelon team built from scratch, documented, and delivered a ready-to-use system that fully met the client's requirements. The software allows the user to de-identify major PHI entities such as names, addresses, SSNs, and telephone numbers with a high level of accuracy. Together with the client, we created a product backlog and defined further product development strategies.
Cloud: AWS Batch, AWS ECR, AWS S3, Aurora, AWS Lambda, API Gateway
Data processing: Spark, PySpark, Python, Scala, spaCy
CI/CD: Terraform, GitHub, GitHub Actions