Application of WordNet in Image Semantic Analysis: Part-Time Master's Thesis in Computer Application Technology
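Abstract (Chinese)
Traditional content-based image retrieval (CBIR) systems often fail to meet users' needs because of the semantic gap between the user's query and the image features. The semantic gap is the main weakness of CBIR systems, and this thesis proposes a method for bridging it: we extract semantics from the text surrounding an image on a web page to overcome this deficiency.
This thesis proposes a new WordNet semantic learning method that uses a set of labeled images to generate candidate semantic information for detecting the main semantic objects in a picture, and applies it to content-based image retrieval. In this method, the database images are divided into two classes, labeled images and unlabeled images, and for each labeled image we design a semantic learning model based on low-level features. Every image in the database is first segmented into multiple regions, from which three types of low-level visual features (color, shape, and texture) are extracted. A semantic detection model built from the statistics of these features is then used to predict and analyze the semantic information hidden in the database. Because human perception of the semantics contained in an image is highly subjective, annotating images with a statistical model built from the low-level features of manually labeled images often produces ambiguous results. To address this problem, the thesis proposes a region weight estimation algorithm that selects the important regions carrying the most semantic information, extracts their features, and then performs region-based, content-based image retrieval over the implied semantic content. During retrieval, only the features of the important regions are used as the feature vectors for computing the semantic distance between images. This semantic learning framework gives a content-based image retrieval system a bridge between high-level semantic concepts and low-level image features. Experimental results show that the proposed method performs better than other similar semantic learning methods.
This thesis uses WordNet as the core of semantic analysis to process the text surrounding an image in order to extract the image's semantic information. Some semantics implicit in an image can be uncovered after semantic analysis and then used for image retrieval. The thesis also defines a set of evaluation criteria for assessing the effectiveness of semantic image retrieval.
Keywords: semantic detection; intelligent image retrieval; visual layers; image segmentation; semantic learning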
Abstract
Traditional content-based image retrieval (CBIR) systems often fail to meet users' needs because of the semantic gap between the features extracted by the system and the user's query. In this thesis we propose an approach to bridging this semantic gap, which is the major deficiency of CBIR systems. We overcome the deficiency by extracting the semantics of an image from the text that surrounds it.
This thesis proposes a new WordNet semantic learning method that detects semantic regions for image retrieval from a given amount of labeling effort. In our approach, the database images are divided into two classes: the labeled class and the unlabeled class. From the images in the labeled class, we construct a concept detector that finds the important regions in each image based on the statistical information of a semantic class. All images in the database are segmented into multiple disjoint regions, each represented by three types of low-level visual features (color, shape, and texture). With this representation, a region weighting model based on the statistics of the low-level visual features is used to analyze the semantic concepts hidden in the database. One key obstacle in applying statistical methods to discover hidden semantic concepts for annotating images is that the amount of manually labeled images is normally insufficient. For images that have not been annotated, the learning algorithm estimates their important regions, whose low-level features are then extracted to retrieve semantically similar images from the test database. Experimental results show that the proposed method performs very well compared with a simulated traditional content-based image retrieval system.
We apply a semantic analysis process, which uses the WordNet learning algorithm as its kernel, to the text surrounding an image in order to extract the image's semantic information. Some implicit semantic information of the images can be discovered after the WordNet process. We also define a semantic relevance measure to evaluate these semantic-based image retrieval tasks.
Keywords: semantic detection; intelligent image retrieval; visual layers; image segmentation; semantic learning
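Contents
Abstract (Chinese)
Abstract
Chapter 1 Introduction
1.1 Research Background and Significance
1.2 Research Objectives
1.3 Research Status at Home and Abroad
1.3.1 Image Semantics
1.3.2 Image Semantic Attributes
1.4 Research Methods
1.5 Organization of the Thesis
1.6 Chapter Summary
Chapter 2 Theoretical Background of WordNet
2.1 Introduction to WordNet
2.2 Functional Implementation of WordNet
2.3 Tree Construction
2.3.1 Verbs
2.3.2 Nouns
2.3.3 Adjectives
2.4 Semantic Merging Relations
2.5 Relations Between Semantic Categories and Experiments
2.6 Chapter Summary
Chapter 3 Literature Review of LabelMe
3.1 Introduction to LabelMe
3.2 Annotation
3.2.1 Original Images
3.2.2 XML Documents
3.2.3 Conventions in the Documents
3.3 Object Recognition
3.4 Scene Recognition
3.4.1 Object-Oriented Methods (object)
3.4.2 Polygons (polygons)
3.4.3 Layer Segmentation (Multiresolution segmentation)
3.5 Chapter Summary
Chapter 4 Statistical Analysis of Image Semantics Based on LabelMe
4.1 Relations Among Real WordNet Vocabulary in LabelMe
4.2 Building the LabelMe-Based Image Semantic Database Model
4.2.1 Original Segmentation
4.2.2 Segmentation After Semantic Merging
4.3 Statistical Analysis of the Image Library
4.3.1 Word Frequency Statistics of the Original Semantics
4.3.2 Word Frequency Statistics After Merging
4.4 Image Semantic Similarity
4.4.1 Introduction to Image Semantic Similarity
4.4.2 Image Semantic Similarity Experiments and Discussion of Results
4.5 Chapter Summary
Chapter 5 Conclusions and Recommendations
References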
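Chapter 1 Introduction
1.1 Research Background and Significance
With the steady advance of technology and the spread of digital cameras, digital camcorders, and camera phones, digital images are now easy to obtain, and the demand for storing multimedia information such as text, images, audio, and video keeps growing. Multimedia database systems have therefore moved toward integrating multiple retrieval mechanisms: in addition to keywords, they accept query conditions based on sound, examples, shape, color, texture, spatial structure, and motion. Traditionally, most systems annotate and retrieve images with textual features such as file names, titles, and keywords. With very large collections, however, keywords become cumbersome: adding them by hand is laborious, and they can no longer adequately express the semantics of the images in the database. Many WordNet-related studies on content-based image retrieval systems have been published [1]. To allow retrieval by image content, low-level image features such as color, shape, and texture have been widely used for retrieval and indexing. Although WordNet has many possible applications, applying it still poses a number of difficulties that this study sets out to address, including segmenting the objects in an image, extracting low-level features that match human semantics, and building retrieval mechanisms on those features [2]. Given these challenges, purely content-based image retrieval techniques cannot achieve good performance, whereas semantic learning methods allow a database to be managed more effectively in image retrieval, image annotation, and multimedia computing.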
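As a minimal sketch of the kind of low-level color feature mentioned above, and not the method developed in this thesis, the following Python example quantizes RGB pixels into a color histogram and ranks database images by histogram intersection. The function names, the choice of four bins per channel, and the toy pixel data are all assumptions made for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize 8-bit RGB pixels into a normalized color histogram (dict bin -> fraction)."""
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of two normalized histograms."""
    return sum(min(value, h2.get(bin_id, 0.0)) for bin_id, value in h1.items())


def rank_by_color(query_pixels, database):
    """Return database image names ordered from most to least similar to the query."""
    query_hist = color_histogram(query_pixels)
    scores = {
        name: histogram_intersection(query_hist, color_histogram(pixels))
        for name, pixels in database.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy images, each given as a flat list of (R, G, B) pixels.
sky = [(30, 60, 220)] * 90 + [(250, 250, 250)] * 10     # blue sky with clouds
sea = [(20, 50, 200)] * 95 + [(240, 240, 240)] * 5      # blue water with foam
forest = [(20, 160, 40)] * 100                          # green foliage

database = {"sea.jpg": sea, "forest.jpg": forest}
ranking = rank_by_color(sky, database)
print(ranking)
```

With this toy data the sky query ranks `sea.jpg` ahead of `forest.jpg` purely because both images are mostly blue, even though "sky" and "sea" are different concepts. This is a small instance of the semantic gap that motivates adding semantic information to retrieval.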
References
[1] Liu Y, Zhang D, Lu G, et al. A Survey of Content-Based Image Retrieval with High-Level Semantics. Pattern Recognition. 2007
[2] Manjunath B S, Ohm J R, Vasudevan V V, et al. Color and Texture Descriptors. IEEE Transactions on Circuits and Systems for Video Technology. 2001
[3] Xirong Li, Le Chen, Lei Zhang, Fuzong Lin, Wei-Ying Ma. Image Annotation by Large-scale Content-based Image Retrieval. Proc. of MM'06. 2006
[4] Lew, Sebe, Djeraba, Jain. Content-based Multimedia Information Retrieval: State of the Art and Challenges. ACM Transactions on Multimedia Computing, Communications, and Applications. 2006
[5] B. Micusik, T. Pajdla. Multi-label image segmentation via max-sum solver. Proc. of CVPR'07. 2007
[6] J. R. Bach, et al. The Virage image search engine: An open framework for image management. Proc. SPIE: Storage and Retrieval for Still Image and Video Databases IV. 1996
[7] Ricardo Baeza-Yates, Berthier Ribeiro-Neto. Modern Information Retrieval. 1999
[8] J. Chen, C. Tang. Spatio-temporal Markov random field for video denoising. Proc. of CVPR'07. 2007
[9] L. Cao, J. Luo, H. Kautz, T. Huang. Annotating collections of photos using hierarchical event and scene models. Proc. of CVPR'08. 2008
[10] S. K. Chang, Q. Y. Shi, C. Y. Yan. Iconic indexing by 2-D strings. IEEE Trans. on Pattern Analysis and Machine Intelligence. 1987
[11] J. Ding, L. Gravano, N. Shivakumar. Computing Geographical Scopes of Web Resources. Proc. 26th Intl. Conference on Very Large Data Bases (VLDB 2000). 2000
[12] Douglas L. Vail, Manuela M. Veloso, John D. Lafferty. Conditional random fields for activity recognition. Proc. of AAMAS'07. 2007
[13] Hearst MA, Pedersen JO. Reexamining the cluster hypothesis: Scatter/gather on retrieval results. Proceedings of the 19th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'96). 1996
[14] M. Ohta, H. Narita, S. Ohno. Overlapping clustering method using local and global importance of feature terms at NTCIR-4 Web task. Working notes of NTCIR (NII-NACSIS Test Collection for IR system)-4 Vol.supl.l. 2004
[15] T. Pedersen, S. Patwardhan, J. Michelizzi. WordNet::Similarity - measuring the relatedness of concepts. Proc. of AAAI'04. 2004
[16] G. Qi, X. Hua, Y. Rui, J. Tang, T. Mei, H. Zhang. Correlative multi-label video annotation. Proc. of ACM SIGMMo.
[17] R. Radev, Hongyan Jing, Malgorzata Budzikowska. Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies. ANLP/NAACL 2000 Workshop. 2000
[18] R. Swan, J. Allan. TimeMine: Visualizing Automatically Constructed Timelines. Proc. of 23rd Annual International ACM SIGIR. 2000
[19] H. M. Sanderson, M. D. Dunlop. Image Retrieval by Hypertext Links. Proceedings of SIGIR'97. 1997
[20] Stanislaw Osinski, Jerzy Stefanowski, Dawid Weiss. Lingo: Search results clustering algorithm based on Singular Value Decomposition. Intelligent Information Systems Conference. 2004
[21] 王梅. 基于多标签学习的图像语义自动标注研究[D]. 复旦大学 2008
[22] 黄鹏. 基于文本和视觉信息融合的Web图像检索[D]. 浙江大学 2008
[23] 虎晓红. 用于图像检索的语义标注技术的研究[D]. 中国矿业大学(北京) 2010
[24] 黄健斌. 基于条件概率图模型的Deep Web数据抽取与集成研究[D]. 西安电子科技大学 2007
[25] 禇一平. 基于条件随机场模型的视频目标分割算法研究[D]. 浙江大学 2007
[26] 何儒汉. Web图像的多模融合检索研究[D]. 华中科技大学 2007
[27] J. Tang, X.-S. Hua, G.-J. Qi, M. Wang, T. Mei, X. Wu. Structure-sensitive manifold ranking for video concept detection. Proc. of ACM MM'07. 2007
[28] B. Wang, Z. Li, N. Yu, M. Li. Image annotation in a progressive way. Proc. of ICME'07. 2007
[29] Allison Woodruff, Andrew Faulring, Ruth Rosenholtz, Julie Morrison, Peter Pirolli. Using Thumbnails to Search the Web. Proc. of SIGCHI'01. 2001
[30] Hua-Jun Zeng, Qi-Cai He, Zheng Chen, Wei-Ying Ma, Jinwen Ma. Learning to Cluster Web Search Results. SIGIR'04. 2004
[31] H. J. Zeng, Q. C. He, Z. Chen, W. Y. Ma, J. Ma. Learning to cluster Web search results. Proc. of SIGIR'04. 2004
[32] X. Zhou, M. Wang, Q. Zhang, J. Zhang, B. Shi. Automatic image annotation by an iterative approach: incorporating keyword correlations and region matching. Proc. of CIVR'07. 2007
[33] Jun Zhu, Zaiqing Nie, Jirong Wen, Bo Zhang, Wei-Ying Ma. 2D Conditional random fields for Web information extraction. Proc. of the 22nd Int'l Conf. on Machine Learning. 2005
[34] Jun Zhu, Zaiqing Nie, Bo Zhang, Jirong Wen. Dynamic hierarchical Markov random fields and their applications to Web data extraction. Proc. 24th Int'l Conf. on Machine Learning. 2007
[35] Flickner M, Sawhney H, Niblack W, et al. Query by image and video content: the QBIC system. IEEE Computer. 1995
[36] 代劲. 云模型在文本挖掘应用中的关键问题研究[D]. 重庆大学 2011
[37] 张奇. 细颗粒度情感倾向分析若干关键问题研究[D]. 复旦大学 2008
[38] 芮晓光. 真实世界环境下的自动图像标注方法研究[D]. 中国科学技术大学 2010
[39] 黄世国. 基于图像的昆虫识别关键技术研究[D]. 西北大学 2008
[40] Gudivada V N, Raghavan V V. Design and Evaluation of Algorithms for Image Retrieval by Spatial Similarity. ACM Transactions on Information Systems. 1995
[41] 毛宇星. 关联规则挖掘在分类数据领域的扩展性研究[D]. 复旦大学 2010
[42] 陈华辉. 基于遗忘特性的数据流概要结构及其应用研究[D]. 复旦大学 2008
[43] 段江娇. 基于模型的时间序列数据挖掘[D]. 复旦大学 2008
[44] 王梅. 基于多标签学习的图像语义自动标注研究[D]. 复旦大学 2008
[45] 余平. 无线数据广播调度与索引技术研究[D]. 复旦大学 2008
[46] 周皓峰. 关联规则挖掘的拓展性研究[D]. 复旦大学 2003
[47] 周向东. 图像数据库检索中的关键技术研究[D]. 复旦大学 2003
[48] 陈良刚. 移动计算环境中位置相关数据管理[D]. 复旦大学 2003
[49] 陶春. 半结构化数据集成系统中的查询处理研究[D]. 复旦大学 2004
[50] 庞引明. 基于结构化联接的XML查询模式匹配关键技术研究[D]. 复旦大学 2004
[51] 张守志. Rough集中若干问题的研究[D]. 复旦大学 2004
[52] 张绍华. 网格工作流关键技术研究[D]. 复旦大学 2004
[53] Haralick RM, Shapiro LG. Computer and Robot Vision. 1992
[54] Lafferty J, McCallum A, Pereira F. Conditional random fields: probabilistic models for segmenting and labeling sequence data. Proceedings of the Eighteenth International Conference on Machine Learning. 2001
[55] Mehrotra R, Gary J. Similar shape retrieval in shape data management. IEEE Computer. 1995
[56] Manjunath B S, Ma W Y. Texture features for browsing and retrieval of image data. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1996
[57] Jinbo Bi, Yixin Chen, James Z Wang. A sparse support vector machine approach to region-based image categorization. Proc. International Conference on Computer Vision and Pattern Recognition. 2005
[58] Kobus Barnard, David Forsyth. Learning the semantics of words and pictures. 2001
[59] Cai Deng, Yu Shipeng, Wen Jirong, et al. VIPS: A Vision-based Page Segmentation Algorithm. Beijing Microsoft Research, Technical Report MSR-TR-2003-79. 2003
[60] Y.X. Chen, J.B. Bi, J.Z. Wang. MILES: Multiple-Instance Learning via Embedded Instance Selection. IEEE Trans. on Pattern Analysis and Machine Intelligence. 2006
[61] G. Carneiro, A.B. Chan, P.J. Moreno, N. Vasconcelos. Supervised Learning of Semantic Classes for Image Annotation and Retrieval. IEEE Trans. on Pattern Analysis and Machine Intelligence. 2007
[62] P. Duygulu, K. Barnard, J. F. G. de Freitas, D. A. Forsyth. Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary. Proc. of European Conference on Computer Vision. 2002
[63] S. L. Feng, R. Manmatha, V. Lavrenko. Multiple Bernoulli Relevance Models for Image and Video Annotation. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[64] HuaMin Feng, Rui Shi, Tat-Seng Chua. A bootstrapping framework for annotating and retrieving WWW images. ACM Multimedia. 2004
[65] E. Gabrilovich, S. Dumais, E. Horvitz. Newsjunkie: Providing Personalized Newsfeeds via Analysis of Information Novelty. Proc. of the 13th International WWW Conference. 2004
[66] Y.L. Gao, J.P. Fan, X.Y. Xue, R. Jain. Automatic Image Annotation by Incorporating Feature Hierarchy and Boosting to Scale up SVM Classifiers. Proc. of ACM International Conference on Multimedia. 2006
[67] Z. Hua, X. Wang, Q. Liu, H. Liu. Semantic Knowledge Extraction and Annotation for Web Images. Proc. of ACM International Conference on Multimedia. 2005
[68] 陈世平. 面向覆盖网典型应用的对等计算研究[D]. 复旦大学 2006
[69] 史玉良. Web服务合成的若干关键技术研究[D]. 复旦大学 2006
[70] 严和平. 基于推理的访问控制与审计技术研究[D]. 复旦大学 2006
[71] 李晓荣. 移动事务管理中的若干关键问题研究[D]. 复旦大学 2006
[72] 刘兵. 时间序列与聚类挖掘相关技术研究[D]. 复旦大学 2006
[73] 刘炜. 基于本体的数字图书馆语义互操作[D]. 复旦大学 2006
[74] 张军旗. 支持最近邻查找的高维空间索引[D]. 复旦大学 2007
[75] Z. Hua, C. Wang, X. Xie, H. Lu, W.-Y. Ma. Automatic Annotation of Location Information for WEB Images. Proc. International Conference on Multimedia and Expo (ICME). 2005
[76] R. Jin, J. Y. Chai, L. Si. Effective Automatic Image Annotation via A Coherent Language Model and Active Learning. Proc. of International Conference on ACM Multimedia. 2004
[77] Jeon J, Lavrenko V, Manmatha R. Automatic Image Annotation and Retrieval Using Cross-Media Relevance Models. The Twenty-Sixth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2003
[78] 王鹏. 数据流上的分类算法的研究[D]. 复旦大学 2007
[79] 许建军. 对结构化和半结构化数据的关键字搜索研究[D]. 复旦大学 2007
[80] 刘方方. Web服务合成与可用性的若干关键技术研究[D]. 复旦大学 2007