Title: Exploring Ways to Automate Image Description Production for STEM
Legend: Proposed pipelines for automated image description. Top: Volunteer-assisted description that uses algorithmic outputs to guide a human describer. Bottom: Fully automated description that can be used to generate descriptions of mathematical content.
Citation: Tekin, S.-A. Ma, and K. Yamaguchi, “Exploring Ways to Automate Image Description Production for STEM,” presented at the 33rd Annual CSUN Assistive Technology Conference, San Diego, CA, March 2018.
Abstract: Children who cannot effectively read print because of a visual, physical, perceptual, developmental, cognitive, or learning disability are considered to have a print disability. The highly visual nature of modern education, especially in STEM (science, technology, engineering, mathematics) fields, is a stumbling block for many students with print impairments. Even though accessibility offices and schools do their best to provide access to textbooks, accommodations for graphical materials are often deprioritized (due to a lack of resources) or cannot be provided in a timely manner. Increased and timelier access to course content is essential to allow more students with visual impairments to obtain degrees and achieve gainful employment.
We aim to streamline the production of accessible educational materials (AEM) by developing a set of free and open-source software tools called the Image Categorization Expert System (ICES). ICES uses machine learning and computer vision techniques to identify different types of diagrams and to automatically extract information from images, initially focusing on eight categories of images frequently found in textbooks, including equations (e.g., math equations and chemical formulas), diagrams/charts, maps, word art (including titles and headings), photos, drawings/paintings, and tables. We have achieved an average accuracy of approximately 88% across all categories, and over 98% for equations; combined with math-transcription software, this is very promising for building an end-to-end pipeline that provides automated access to equations. Such a pipeline can significantly improve the efficiency of volunteers, leading to faster and more consistent production of digital AEM.
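The categorization step described above can be sketched as follows. This is a minimal illustration only, not the actual ICES implementation: the category labels are taken from the abstract, but the nearest-centroid classifier and the synthetic "features" are stand-ins assumed for demonstration (real systems would use learned image features, e.g. CNN embeddings).

```python
import numpy as np

# Category labels drawn from the abstract; the feature extraction
# below is a synthetic stand-in, not ICES's computer-vision front end.
CATEGORIES = [
    "equation", "diagram/chart", "map", "word art",
    "photo", "drawing/painting", "table",
]

rng = np.random.default_rng(0)

def make_synthetic_features(n_per_class=20, dim=16):
    """Generate well-separated synthetic feature vectors per category."""
    X, y = [], []
    for label in range(len(CATEGORIES)):
        center = rng.normal(loc=label, scale=0.1, size=dim)
        X.append(center + rng.normal(scale=0.3, size=(n_per_class, dim)))
        y.extend([label] * n_per_class)
    return np.vstack(X), np.array(y)

class NearestCentroid:
    """Minimal classifier: assign each image to the closest class centroid."""
    def fit(self, X, y):
        self.centroids_ = np.stack(
            [X[y == c].mean(axis=0) for c in np.unique(y)]
        )
        return self

    def predict(self, X):
        # Euclidean distance from each sample to each centroid.
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None], axis=2)
        return d.argmin(axis=1)

X, y = make_synthetic_features()
clf = NearestCentroid().fit(X, y)
acc = (clf.predict(X) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

On these cleanly separated synthetic clusters the classifier is near-perfect; the roughly 88% average accuracy reported for ICES reflects the much harder problem of distinguishing real textbook images.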
About the Lab: Loss of vision can frequently lead to a loss of independence and a reduction in quality of life for an individual. The Tekin lab is interested in harnessing new mobile technologies to provide access to environmental information for persons with vision loss. As mobile devices and wearable technologies proliferate, there are new avenues to interact with the constant mesh of devices and appliances in the environment, opening up new possibilities to improve the independence and self-sufficiency of persons with vision loss.
One of the lab’s main interests is using emerging technologies to improve communication aids for persons with vision and hearing loss, a fast-growing segment of the population in developed countries as life expectancies increase. Whereas persons who have hearing loss but good vision can make use of facial cues to improve their speech reception, persons who have combined vision and hearing loss are unable to compensate for the loss of information in communication. The lab is exploring combining audio and video inputs to improve speech enhancement algorithms that aid speech reception for persons with such dual sensory loss.