WSD Implementation for Processing Improvement of Structured Documents
Keywords:
WordNet, supervised classification, word similarity, context similarity, ontology
Abstract
Word sense disambiguation (WSD) is introduced in the context of text document processing. A knowledge-based approach is built on the WordNet lexical ontology; its structure and the components used to identify the context-related senses of each polysemous word are described. The principal distance measures defined on the graph associated with WordNet are presented, together with an analysis of their advantages and disadvantages. A general model for aggregating distances and probabilities is proposed and implemented in an application that detects the contextual sense of each word. For words that do not exist in WordNet, a similarity measure based on co-occurrence probabilities is used.
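As an illustration of graph-based sense selection, the following minimal sketch scores each candidate sense of a word by its path similarity to the senses of the surrounding context words and keeps the best one. It is not the aggregation model proposed in the paper: the hypernym links, the sense labels (such as bank#river) and the use of a plain average are all assumptions made for the example, and the toy graph stands in for the real WordNet hierarchy.

```python
from collections import deque

# Hypothetical toy hypernym links (child -> parent); NOT real WordNet data.
HYPERNYM = {
    "bank#river": "slope",
    "slope": "landform",
    "shore#1": "landform",
    "landform": "natural_object",
    "river#1": "stream",
    "stream": "body_of_water",
    "body_of_water": "natural_object",
    "natural_object": "entity",
    "bank#finance": "financial_institution",
    "financial_institution": "organization",
    "organization": "entity",
}

# Hypothetical sense inventory: word -> candidate sense labels.
SENSES = {
    "bank": ["bank#river", "bank#finance"],
    "shore": ["shore#1"],
    "river": ["river#1"],
}


def build_graph(links):
    """Turn child -> parent links into an undirected adjacency map."""
    graph = {}
    for child, parent in links.items():
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


GRAPH = build_graph(HYPERNYM)


def path_length(a, b):
    """Number of edges on the shortest path between two nodes, or None."""
    if a == b:
        return 0
    seen = {a}
    queue = deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in GRAPH.get(node, ()):
            if nxt == b:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def path_similarity(a, b):
    """Path-based similarity: 1 / (1 + shortest path length)."""
    length = path_length(a, b)
    return 0.0 if length is None else 1.0 / (1.0 + length)


def disambiguate(word, context_words):
    """Pick the sense with the highest average similarity to the context.

    Averaging is an assumption of this sketch, not the paper's model.
    Context words missing from SENSES are skipped.
    """
    best_sense, best_score = None, -1.0
    for sense in SENSES[word]:
        scores = [
            max(path_similarity(sense, other) for other in SENSES[ctx])
            for ctx in context_words
            if ctx in SENSES
        ]
        score = sum(scores) / len(scores) if scores else 0.0
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


if __name__ == "__main__":
    print(disambiguate("bank", ["shore", "river"]))
```

In this toy graph the river sense of "bank" lies three edges from "shore", while the financial sense is reachable only through the root, so the context pulls the choice toward the river sense. Context words that are absent from the sense inventory are simply skipped here; the abstract indicates that the paper handles such words with a co-occurrence based measure instead.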
License
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).
- The author(s) are responsible for the correctness and legality of the paper content.
- Papers that are copyrighted or already published will not be considered for publication in JMEDS. It is the responsibility of the author(s) to ensure that the paper does not cause any copyright infringement or other problems.
- It is the responsibility of the author(s) to obtain all necessary copyright release permissions for the use of any copyrighted materials in the paper prior to the submission.
- The Author(s) retains the right to reuse any portion of the paper, in future works, including books, lectures and presentations in all media, with the condition that the publication by JMEDS is properly credited and referenced.
JMEDS articles by Journal of Mobile, Embedded and Distributed Systems (JMEDS) are licensed under a Creative Commons Attribution 4.0 International License.
Based on a work at http://jmeds.eu.
Permissions beyond the scope of this license may be available at http://jmeds.eu/index.php/jmeds/about/submissions#copyrightNotice.