WSD Implementation for Processing Improvement of Structured Documents

Madalina Zurini

Abstract


Word sense disambiguation (WSD) is introduced in the context of text document processing. A knowledge-based approach is taken using the WordNet lexical ontology; its structure and the components used to identify the context-related sense of each polysemous word are described. The principal distance measures defined on the graph associated with WordNet are presented, together with an analysis of their advantages and disadvantages. A general model for aggregating distances and probabilities is proposed and implemented in an application that detects the contextual sense of each word. For words that do not exist in WordNet, a similarity measure based on co-occurrence probabilities is used.

Keywords


WordNet, supervised classification, word similarity, context similarity, ontology

Journal of Mobile, Embedded and Distributed Systems (JMEDS) ISSN: 2067 – 4074 (online)