The Hidden Challenges and Misconceptions about AI-based Tools in Vision Research
Part 2 in a series on Artificial Intelligence in Ophthalmology. Read Part 1.
By Joy Curzio
Artificial intelligence (AI)-led advancements in vision research and science range from tools that patients can use at home to monitor disease progression to algorithms that can identify patient characteristics such as gender, age, and smoking history from a retinal image alone. At minimum, these advancements can facilitate basic research by improving the speed and accuracy of data processing. At best, AI might one day also directly improve patient care in new and innovative ways.
Before any improvements in patient outcomes with AI tools can be seen, however, vision researchers must be able to acquire new skills essential to effective, collaborative teams, understand and implement new solutions for data acquisition and quality control, and use innovative blends of resources to achieve success.
Much of the AI currently being implemented in vision research and science is based on machine learning (ML)-based algorithms. These ML algorithms typically employ large structured data sets that are, at least initially, manually created and labeled by humans. ML algorithms incorporate the human-defined data to “learn” to identify similarities or anomalies in additional future data sets, and human oversight is key to ensuring quality control of any ML-algorithm output. Deep-learning (DL) artificial neural networks also are used in vision research, but these rely less on processing of structured data and more on the passing of queries through various conceptual hierarchies, much in the same way a human brain works. Although it is relatively easy to see the learning path for an ML-based algorithm, DL-based networks are much less transparent and for this reason are often known as “black-box” technologies. (Challenges and best practices regarding transparency of patient data, as well as insights about how AI can potentially transform patient care, will be discussed in the next installment of the AI article series.)
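The supervised workflow described above — humans label a data set, and the algorithm then uses those labels to classify new examples — can be sketched with a toy nearest-neighbor classifier. This is a minimal, illustrative sketch in plain Python; real vision models learn from image data and far richer features, and the two-feature points and labels here are invented for demonstration.

```python
# Toy supervised learning: a 1-nearest-neighbor classifier "trained" on
# human-labeled examples, then applied to unseen data points.
# Illustrative only; real retinal-image algorithms are far more complex.

def predict(labeled_data, point):
    """Return the label of the training example closest to `point`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(labeled_data, key=lambda example: sq_dist(example[0], point))
    return nearest[1]

# Human-annotated training set: (features, label) pairs, as a stand-in
# for the manually labeled structured data sets the article describes.
training = [
    ((0.1, 0.2), "healthy"),
    ((0.2, 0.1), "healthy"),
    ((0.8, 0.9), "diseased"),
    ((0.9, 0.8), "diseased"),
]

print(predict(training, (0.15, 0.15)))  # point near the "healthy" cluster
print(predict(training, (0.85, 0.85)))  # point near the "diseased" cluster
```

The key point the sketch makes is that the algorithm's notion of "healthy" versus "diseased" comes entirely from the human-supplied labels — which is why human oversight of label quality carries through to the quality of every prediction.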
Creating algorithms without huge expenses, highly technical skills
Although both DL networks and ML algorithms can require powerful computing networks with a large number of graphics processing units to produce and verify high-quality data sets, ML algorithms themselves can be very inexpensive to create. Several automated ML platforms are available that allow individuals with nonspecialized backgrounds to create and train simple algorithms for very little financial investment. In addition, experts note that the use of cloud computing has negated the need for expensive hardware on which to store these large structured data sets.
“There are discounted hardware and software products now available for researchers and students, and there is a push for every major machine-learning framework to be open sourced so that it can be downloaded for free,” according to Jian Dai, principal data scientist at Roche/Genentech. “There is also committed support in AI research regarding sharing of neural network weight files and access to preprint online archives for research papers. So, it’s more about building a talented team to attack the right kind of problem than it is about the hardware and the software.”
ML is a relatively new field, with certifications and degree programs just starting to become more popular. Collaborative research teams within larger companies working in research and development of vision-related AI, such as Roche/Genentech, are starting to include computational engineers alongside clinical experts. Many of the recent graduates specialized in AI, however, are being hired by technology companies, leaving very few for the academic setting. Because of this, the leveraging of skill sets on research teams, regardless of how those skills were attained, matters more than the skill sets themselves. Researchers with very little coding experience can participate in building and training algorithms, and clinical experts are starting to familiarize themselves with AI platforms and techniques.
“There are mature convolutional neural networks that have been developed and can already differentiate everything from a cat to a hat,” said Louis Pasquale, MD, a glaucoma specialist and AI enthusiast. “So actually, it isn’t that expensive; you just need access to a server, cloud computing, and a laptop. Someone just has to collect and annotate the images, and high-quality imaging hardware has to be purchased. That’s where the costs are.”
The tough part: External validation of generalizable data
Because ML algorithms and computational models often require extremely large, structured data sets, the uniformity and quality of the data are key to the success of any output. For example, a large data set collected at an institution in a developed country, consisting of white patients aged 70 to 80 with macular degeneration, can be used to build an algorithm. However, according to Pearse Keane, a retinal specialist at Moorfields Eye Hospital and a researcher at the University College London Institute of Ophthalmology, these data will not be transferable to other parts of the world. Instead, equally large data sets of sufficient quality drawn from individuals native to Brazil or South Africa, for example, are needed to ensure that the algorithm works across geographic regions, data sets, and individual characteristics.
According to Keane, ensuring generalizability and consistent quality of data is a multifactorial challenge. Everything from the type of camera used for imaging to the protocol each individual center follows for image collection and annotation can cause small alterations in data quality that will be reflected in the algorithm output if there is no human oversight. Keane posited that external validation of data output is going to be one necessary area of skills development in the future, especially for algorithms that will be used in direct clinical care.
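The external-validation step Keane describes can be sketched as a simple comparison: measure an algorithm's accuracy on a held-out set from the institution that built it, then on an independent cohort from a different population, and flag any drop. The predictions, labels, and threshold below are hypothetical stand-ins, not results from any real model.

```python
# Sketch of external validation: compare accuracy on an internal held-out
# set against an independent external cohort. All values are hypothetical.

def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)

# Hypothetical case: the model performs well on its home institution's data...
internal_acc = accuracy(["AMD", "normal", "AMD", "AMD"],
                        ["AMD", "normal", "AMD", "AMD"])

# ...but degrades on an external cohort with different demographics,
# cameras, or collection protocols.
external_acc = accuracy(["AMD", "AMD", "normal", "AMD"],
                        ["AMD", "normal", "normal", "normal"])

print(f"internal: {internal_acc:.2f}, external: {external_acc:.2f}")
if internal_acc - external_acc > 0.10:  # illustrative threshold
    print("Warning: accuracy drop on external data; may not generalize.")
```

In practice, clinical validation involves far more than one accuracy number (sensitivity, specificity, calibration, subgroup analyses), but the basic logic — good internal performance does not guarantee external performance — is exactly the point made above.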
“AI is not magic. The current techniques are quite data-heavy, so they work very well with large amounts of data but don’t yet have the ability to learn as well from small amounts of data. In addition, particularly when dealing with healthcare-related data, there is the question of how you collect multi-institutional international data when all of the entirely appropriate measures are in place around data protection and privacy,” Keane said.
"AI is not magic. The current techniques are quite data-heavy, so they work very well with large amounts of data but don’t yet have the ability to learn as well from small amounts of data."