We are all affected by the COVID-19 pandemic, and we all want to get back to normal: safe living and safe business, as soon as possible. Fortunately, there are very hopeful developments with vaccines, but it will take time to vaccinate hundreds of millions of people. In many situations instant testing will be required, and it will become an integrated part of our daily lives. Spektrax is an organisation developing a completely novel nanotechnology-based ultra-rapid virus test kit that will take only a few minutes to analyse in-vitro samples. They use a portable, handheld device to test for and validate the presence of a virus. Their test kit technology turns these samples into high-dimensional data. But how should such data be interpreted, and how can a split-second classification be made that determines whether a virus is present or not? These two challenging questions have been occupying our minds for the past months.
From early in the development of this cool new technology, we have been helping Spektrax find answers to these challenges. Working as fast as possible and constantly iterating on interpretation methods, together we identified how to turn Raman spectroscopy data into an understandable input for a machine learning algorithm. It is good to keep in mind that machine learning is not magic: if the data shows no sign of differences between classes, an algorithm will not be able to find such differences, let alone create them. We therefore started by identifying the difference between high- and low-quality datasets and creating visualisations that show whether or not the samples contain the right information. This gave the lab team insight into how to improve the data even further. After all, that is where it all starts and where the quality needs to be as high as possible. To gain insights as soon as possible, a data warehouse was set up, combined with a web application where the lab team could upload new data and instantly see the visualisations.
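To make the "machine learning is not magic" point concrete, here is a minimal sketch of one way to check whether two classes of spectra differ at all before training anything. It is not the method used in the project: the area normalisation, the per-bin Fisher score, and the function names `normalise` and `fisher_scores` are all assumptions chosen for illustration, and a spectrum is assumed to be a plain list of intensities, one per wavenumber bin.

```python
from statistics import mean, pvariance


def normalise(spectrum):
    """Scale one spectrum to unit total intensity (simple area normalisation)."""
    total = sum(spectrum)
    if total == 0:
        raise ValueError("spectrum has zero total intensity")
    return [value / total for value in spectrum]


def fisher_scores(positive, negative):
    """Per-bin Fisher score: (difference of class means)^2 / (sum of class variances).

    Both arguments are lists of equally long spectra. A score near zero
    means the two classes look the same in that bin.
    """
    scores = []
    # zip(*spectra) regroups the values so that we iterate bin by bin.
    for pos_bin, neg_bin in zip(zip(*positive), zip(*negative)):
        gap = (mean(pos_bin) - mean(neg_bin)) ** 2
        spread = pvariance(pos_bin) + pvariance(neg_bin)
        if spread > 0:
            scores.append(gap / spread)
        else:
            # No variation within either class: perfectly separated or identical.
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores


# Made-up toy spectra with two bins each, purely for illustration.
positive = [[1.0, 4.0], [3.0, 6.0]]
negative = [[1.0, 0.0], [3.0, 2.0]]
scores = fisher_scores(positive, negative)  # only the second bin separates the classes
```

If every bin scores close to zero, no classifier is likely to do better than chance on that dataset, which is a signal to go back to the lab rather than to the model.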
While turning data into insights, Spektrax even organised a hackathon at 1.5 metres distance to speed up work on various challenges, such as data preprocessing and quality control. It was vital to be flexible and fast during the development and validation phase of the data science solution.
The next step was to develop a classification model that can be used in real time, anywhere. The difficulty here was to prevent the model from overfitting. It proved very important to steer clear of any bias at all times, from the person testing in the lab and the actual device to the training of the model, because any bias could lead to undesirable classifications. As we cross-validated each dataset and continuously added more data, the importance of choosing the right algorithm quickly became clear. Of course, this is an ongoing process of research and development, which is now being continued by Spektrax's in-house data science team. It has been a great ride to be part of the data tech team of such an ambitious startup. The complete solution is still undergoing further optimisation, but our work here is done. Even though there is currently a huge demand for fast screening, Spektrax will only launch its system after thorough validation. Their realistic aim is to bring their first reliable and validated tests to the market in 2021. We really enjoyed working with the Spektrax team on this highly innovative technology and are looking forward to their future successes.
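For readers curious about the bias point above, one common way to keep operator or device effects out of a cross-validation score is to hold out whole groups of measurements at a time. The sketch below illustrates that idea only; it is not a description of the validation scheme actually used, and the function name `leave_one_group_out` and the group labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group.

    `groups` holds one label per measurement, for example the operator and
    device that produced it. No group ever appears on both sides of a split.
    """
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Hypothetical labels: which operator/device pair produced each measurement.
groups = ["op_a/dev_1", "op_a/dev_1", "op_b/dev_1", "op_b/dev_2", "op_b/dev_2"]
folds = list(leave_one_group_out(groups))
```

A model that scores well on ordinary random splits but poorly on splits like these has probably learned something about the operator or the device rather than about the virus.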