HYBRID MULTIMODAL TEXT DIGITIZATION FOR PUBLISHING AND PRINTING
Abstract
This paper investigates the application of AI to text input in the publishing sector. It establishes a classification of digitization methods by text complexity and defines the key criteria for selecting among them. To improve the processing of complex content, the author proposes a hybrid approach that combines Optical Character Recognition (OCR) with Automatic Speech Recognition (ASR), together with a specialized multimodal algorithm integrated into publishing workflows.
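Printing, Multimedia and Web Technologies in the Digital Environment. Volume 1: Collective Monograph (original title: Поліграфічні, мультимедійні та web-технології у цифровому середовищі. Том 1: колективна монографія)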
Pages
335–344
Published
June 5, 2026
Copyright (c) 2026 Press of the Kharkiv National University of Radioelectronics
Details about this monograph
ISBN-13
978-617-8254-58-2