Machines are getting smarter at understanding our world. Not only can they turn words into pictures, but they’re also getting better at recognizing objects and people. As advances in AI improve automated identification, image-to-text transcription has made great strides in extracting text from images.
This paves the way for even more powerful Optical Character Recognition (OCR) technology that can tackle complex layouts, diverse fonts, and handwritten text with increasing efficiency.
For the inside track on how OCR is helping teachers in the classroom, we chatted with Amber Chung, Product Manager at AVer’s Integrated Presentation and Education team, about the newly introduced OCR function in AVerTouch, the software designed exclusively to manage AVer visualizers (document cameras).
Purpose of AVerTouch OCR
The idea of scanning documents is often more associated with the office environment. Could you set the scene for why it’s relevant in the classroom?
Despite the rise of digital tools in classrooms, many teaching materials remain in paper format. This creates a hurdle for teachers who need to share these physical resources with their students electronically.
Typically, getting documents into digital form involves using a communal scanner-printer or a scanning app on a mobile phone. After scanning, you’ll need to save or upload it somewhere easy to find, so you can access and project it or show it on a display in class later on.
This process is tedious and time-consuming, especially when dealing with large quantities of materials.
How did your team address the challenges of getting content on paper into digital form?
We focused on, firstly, making scanning documents easily accessible and, secondly, improving efficiency when scanning lots of documents.
What better way to tackle these challenges than right there in the classroom, where teachers are already using AVer visualizers? It’s a familiar setup for them, so there’s really no learning curve. That’s why we added OCR as a new feature within AVerTouch, the software that manages their visualizer.
The goal is to give teachers an all-in-one solution where they control every step: scan the document, select the digital file format to save it in, and then share it with students or other teachers.
Scan, save, and share… that’s a nice soundbite!
Yes, and all of that can be done with AVerTouch and the OCR function embedded in the AVer visualizer. That integrated approach of hardware and software is a key differentiator with AVer’s solutions for education. It’s one smooth process, one ecosystem by design.
Functionality of AVerTouch OCR
Could you elaborate on how AI works with OCR?
AVer has deep expertise in developing a range of AI image-processing algorithms for video. When applied to enhance OCR results, that makes for a very user-friendly experience.
A teacher can simply “place and scan” documents and materials. AI capabilities, such as auto cropping, automatic page orientation, keystone correction, edge fill, and A3/A4 page sizing, to name a few, work behind the scenes to seamlessly produce a clear and ready-to-use digital document.
What feedback from users informed the design of the OCR features?
In our research, we found that the amount of light in the classroom can impact how accurately the OCR reads text. To address this sensitivity, we included an additional feature that allows teachers to manually fine-tune the contrast and brightness settings. So, the combination of AI and human input, when needed, ensures great scan results under virtually any lighting.
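To make the idea of manual contrast and brightness tuning concrete, here is a minimal sketch of a linear adjustment on 8-bit grayscale pixels. It is an illustration only, not AVerTouch's implementation: the function name and the `brightness` and `contrast` parameters are assumptions made for this example.

```python
def adjust_brightness_contrast(pixels, brightness=0, contrast=1.0):
    """Linearly adjust 8-bit grayscale pixels (illustrative, not AVer's code).

    pixels:     list of rows, each a list of ints in 0..255
    brightness: offset added to every pixel (hypothetical parameter)
    contrast:   gain applied around mid-gray 128 (hypothetical parameter)
    """
    def clamp(value):
        # Keep results inside the valid 8-bit range.
        return max(0, min(255, int(round(value))))

    return [
        [clamp(contrast * (p - 128) + 128 + brightness) for p in row]
        for row in pixels
    ]


# A dim, low-contrast patch: faint text (100) on a gray page (140).
dim_scan = [[140, 100, 140], [140, 140, 100]]
adjusted = adjust_brightness_contrast(dim_scan, brightness=20, contrast=2.0)
print(adjusted)  # [[172, 92, 172], [172, 172, 92]]
```

In this toy patch the gap between text and page doubles from 40 to 80 gray levels, which is the kind of separation that tends to make characters easier for an OCR engine to pick out in poor lighting.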
Our OCR function also supports multiple-language recognition. Up to 133 languages, in fact. That’s an important requirement for our users. In one of our field tests, a Japanese teacher praised the accuracy of the scans, which was particularly impressive given that many Japanese books use vertical text formatting.
And for students with visual impairments, having searchable and editable text in digital form can be incredibly useful. Scanned text can be put through a text-to-speech app, which gives them a quick way to hear the content read aloud.
Future Developments for AVerTouch OCR
What are the future development plans for the OCR function in AVerTouch?
The potential of adding insight to recognition is an interesting area. Although current OCR technology does very well at character recognition, the future holds exciting possibilities for intelligent character recognition (ICR) and intelligent word recognition (IWR) technologies. These advancements go beyond just reading images and words to unveiling deeper meaning from text.
For instance, imagine a much more efficient way for teachers to grade test papers. The teacher scans the students’ handwritten test papers and then AI checks them against a model answer template. A final review by the teacher is essential, to be sure, but the OCR function with next-gen AI recognition technologies can really do a lot of the heavy lifting.
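As a rough illustration of that grading scenario, the sketch below compares already-recognized answers against a model answer key and flags low-similarity answers for the teacher. Everything here is an assumption for the sake of the example: the function names, the 0.8 threshold, and the use of Python's standard `difflib` string similarity stand in for whatever recognition and matching a future product might actually use.

```python
from difflib import SequenceMatcher


def normalize(text):
    # Lowercase and collapse whitespace so formatting noise is ignored.
    return " ".join(text.lower().split())


def score_answer(recognized, model_answer):
    # Similarity in 0.0..1.0 between the recognized text and the model answer.
    return SequenceMatcher(
        None, normalize(recognized), normalize(model_answer)
    ).ratio()


def flag_for_review(answers, answer_key, threshold=0.8):
    """Score each answer and mark those below the (hypothetical) threshold."""
    results = {}
    for question, recognized in answers.items():
        similarity = score_answer(recognized, answer_key[question])
        results[question] = {
            "similarity": round(similarity, 2),
            "needs_review": similarity < threshold,
        }
    return results


# Hypothetical OCR output for one student's test paper.
answer_key = {"Q1": "photosynthesis", "Q2": "chloroplast"}
student = {"Q1": "  Photosynthesis ", "Q2": "mitochondria"}
report = flag_for_review(student, answer_key)
```

Here Q1 matches the key once case and spacing are normalized, while Q2 falls below the threshold and is flagged. The flag only prioritizes the teacher's attention; the final review still rests with the teacher.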
References
- Chung, Amber et al. In-person interview. AVer. 2024.