Enhance Data Similarity Using a Fuzzy Approach

Dhuha Kh. Altmemi, Imad S. Alshaw

Abstract

Text similarity is critical in a variety of applications, including word processing, signal processing, imagery, data mining, and wireless sensor networks, where text similarity measures can determine whether two texts are lexically or semantically similar. Semantic text similarity describes similarity based on meaning. Although measuring it is very challenging, it remains an active subject of study because of the complexity of natural language. The second type, lexical similarity, can be used to eliminate repetition by grouping texts together when they are very similar. It is important to remember that traditional text similarity approaches compare two texts by looking only at the actual words they contain. Depending on the use case, this word-level approach is easier to build and manage and offers a better trade-off. This paper examines current work on text similarity and divides it into four categories: string-based, corpus-based, knowledge-based, and hybrid similarity techniques, all of which are comparable. Examples of different combinations of these techniques for matching texts and finding similarities between two texts are also given. A smart method called fuzzy data similarity (FDS) is proposed to measure the similarity between two texts. To demonstrate its efficiency, the method was compared with the most widely known methods, and the results showed that FDS achieves an accuracy of about 93%.
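
As an illustration of the string-based (lexical) category only, the following minimal Python sketch compares two texts by the words and characters they share. It is not the FDS method, which the abstract does not specify; the function names `tokenize`, `jaccard_similarity`, and `sequence_similarity` are hypothetical, and whitespace tokenization is an assumption made for brevity.

```python
from difflib import SequenceMatcher


def tokenize(text: str) -> set:
    """Lowercase the text and split on whitespace into a set of words."""
    return set(text.lower().split())


def jaccard_similarity(a: str, b: str) -> float:
    """Fraction of distinct words shared by both texts, from 0.0 to 1.0."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 1.0  # assumption: two empty texts count as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def sequence_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio from the standard library's difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    first = "the cat sat on the mat"
    second = "the cat lay on the mat"
    print(jaccard_similarity(first, second))
    print(sequence_similarity(first, second))
```

Both measures look only at surface forms, so two sentences with the same meaning but different wording score low, which is the limitation of lexical approaches noted above.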
