Design of a Text Localization Algorithm for Images with Complex Backgrounds (Final Draft)

Shared by 天下 on 2024/10/2 18:06:04

Graduation Project (Thesis) Report

Title: Design of a Text Localization Algorithm for Images with Complex Backgrounds

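Abstract

With the rapid development of multimedia technology, research on text localization in images with complex backgrounds has not only enriched image processing theory but also has significant value in applications such as image retrieval on the Internet and license plate recognition in traffic management. Text localization in complex backgrounds is a difficult research problem: the backgrounds of text images are highly complex, most images are captured outdoors under widely varying lighting, and the text itself varies greatly in color, brightness, font, size, spacing, contrast, orientation, and background texture. To extract text from a complex background, the regions containing text must first be found; only then can a text recognition module be applied. This thesis surveys the main existing text localization methods, analyzes their strengths and weaknesses, and implements an image text localization method based on edge detection and support vector machines. The edge-detection-based localization consists of pyramid decomposition, edge detection based on an improved Canny operator, edge extraction and binarization, connected-region analysis, and text region identification and merging. First, the improved Canny edge detection algorithm detects text edges; connected-region analysis and text region identification and merging are then applied to the detection result to obtain candidate text regions. Further, a support vector machine classifier is trained on the candidate text regions to improve localization accuracy. Experimental results show that the method locates the corresponding text regions fairly accurately and has a certain significance and considerable practical value.

Keywords: text localization; edge detection; feature extraction; support vector machines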

ABSTRACT

With the development of multimedia technology, the study of locating text against complicated backgrounds has not only enriched image processing theory but also has enormous value in practical applications, for example image retrieval on the Internet and license plate recognition in traffic administration. Locating and extracting text from complex backgrounds is an important research problem in computer vision. Variation in character font, size, style, orientation, alignment, texture, and color, together with the complex background, makes text localization very difficult. The scene content is unconstrained and may be either indoor or outdoor, under any lighting or contrast conditions.

To extract text from a complex background, the text areas must be located first. This paper surveys current text location methods and analyzes their advantages and disadvantages. A text location method based on edge detection and support vector machines is then implemented.

The edge-detection-based text location method consists of pyramid decomposition, edge detection based on an improved Canny algorithm, edge extraction and binarization, connected component analysis, and text region identification and merging. First, the improved Canny algorithm is used to detect text edges; connected component analysis and text region identification and merging are then used to obtain candidate text regions. This paper trains a support vector machine classifier to improve the accuracy of text location. The support vector machine is applied to reduce the number of examples effectively, and the experimental results are good.

The experimental results show that the algorithm locates text well and accurately, and that it is valuable in both theory and application.

Keywords: text location; edge detection; feature extraction; support vector machines
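The connected component analysis and region merging steps named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `connected_components` and `merge_boxes`, the 8-connectivity choice, and the `max_gap` parameter are assumptions made for illustration, and the pyramid decomposition, improved Canny detection, and support vector machine stages are omitted. The input is assumed to be an already binarized edge map given as a list of rows of 0/1 values.

```python
from collections import deque


def connected_components(binary):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected foreground regions."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def merge_boxes(boxes, max_gap=2):
    """Merge boxes that overlap vertically and are at most max_gap columns apart.

    Single left-to-right pass; a simplification, not a full transitive merge.
    """
    merged = []
    for x0, y0, x1, y1 in sorted(boxes):
        for i, (mx0, my0, mx1, my1) in enumerate(merged):
            overlaps_vertically = y0 <= my1 and my0 <= y1
            close_horizontally = x0 - mx1 - 1 <= max_gap
            if overlaps_vertically and close_horizontally:
                merged[i] = (min(mx0, x0), min(my0, y0),
                             max(mx1, x1), max(my1, y1))
                break
        else:
            merged.append((x0, y0, x1, y1))
    return merged


if __name__ == "__main__":
    # Two nearby character-like blobs and one distant blob (hypothetical data).
    edge_map = [
        [1, 1, 0, 1, 1, 0, 0, 0, 0, 1],
        [1, 1, 0, 1, 1, 0, 0, 0, 0, 1],
    ]
    print(merge_boxes(connected_components(edge_map)))
    # prints [(0, 0, 4, 1), (9, 0, 9, 1)]
```

In this toy input the two blobs separated by a single empty column are merged into one candidate region, while the distant blob stays separate. A real pipeline would likely add geometric filters (size, aspect ratio, edge density) before passing candidates to a classifier.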

Chapter 1 Introduction

1.1 Research background and significance
1.2 Current state of text localization research
1.3 Main research content and structure of the thesis

Chapter 2 General Methods for Text Localization in Images with Complex Backgrounds

2.1 Text features and categories
2.2 Text localization workflow
2.3 Text localization methods
2.4 Chapter summary

Chapter 3 Research on Text Localization Based on Edge Detection

3.1 Introduction
3.2 Edge detection
3.3 Connected-region analysis
3.4 Text region localization and merging
3.5 Experimental results
3.6 Chapter summary

Chapter 4 Conclusion
References
Chinese Translation of Foreign-Language Material
Acknowledgments

Chapter 1 Introduction

1.1 Research Background and Significance

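Text localization in images builds on digital image processing and draws on several disciplines, including pattern recognition, neural networks, signal detection, and cognitive science. With the rise of optical character recognition (OCR), many researchers began studying text localization and extraction in document images. As a preprocessing stage of an OCR system, image text localization plays an important role in recognizing text embedded in complex images. In recent years, with the rapid development of multimedia technology and computer networks, the volume of digital images worldwide has been growing at an astonishing rate. Massive numbers of images are produced every day, and they contain a great deal of useful information. Current computer vision and artificial intelligence techniques cannot annotate images automatically, so annotation must rely on people. This work is time-consuming and laborious, and manual annotations are often inaccurate or incomplete and inevitably carry subjective bias. How to locate and extract text quickly and accurately from images and video with complex backgrounds has therefore become an active research topic internationally.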

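A complex background means the following: the image background contains rich texture; the text is sometimes embedded in the texture, and sometimes the text itself is texture; and the possible position, illumination, font, size, and color of the text all vary and are unknown before localization. These three points are precisely what makes this research challenging. If methods can be found to solve these problems and a model for text localization in complex backgrounds can be constructed, this will have important theoretical significance and practical value for enriching image processing theory and for advancing content-based video retrieval.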

Applications of text localization in complex backgrounds:

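(1) Real-time license plate localization. License plate images captured by cameras on highways are analyzed and processed by a license plate recognition system, which makes it possible to monitor traffic conditions in real time, identify in real time the vehicles involved in traffic accidents, and improve the efficiency of transport regulators.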