Kaspersky Lab has received a U.S. patent for its innovative technology for identifying the text of electronic messages as spam. Spam causes considerable damage to both companies and consumers. Unwanted e-mail messages often contain fraudulent offers, malicious attachments or links to infected websites.
One of the most popular and effective ways to combat unwanted e-mail is to classify messages according to whether they contain keywords and phrases typical of spam. This approach not only allows the system to be configured to block new types of spam, but also provides a high detection rate with a minimal number of false positives.
With the patented technology, electronic text messages are classified against a hierarchical list of categories. Each category is defined by a set of keywords and text templates. An incoming message is categorized as follows: first, its weight is calculated with respect to each category whose keywords are found in the e-mail. Then its degree of similarity to each of the templates is determined. If the message contains enough keywords or is sufficiently similar to one of the templates, it is assigned to the corresponding category, which may be spam.
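The two-step check described above can be illustrated with a minimal Python sketch. It is not the patented method: the per-keyword weights, the Jaccard word-overlap measure used for template similarity, both threshold values and all names (`Category`, `classify` and so on) are assumptions made for the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    keywords: dict[str, float]                 # keyword -> weight (made-up values)
    templates: list[str] = field(default_factory=list)


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def keyword_weight(tokens: list[str], category: Category) -> float:
    """Step 1: sum the weights of the category keywords present in the message."""
    return sum(category.keywords.get(t, 0.0) for t in set(tokens))


def template_similarity(tokens: list[str], template: str) -> float:
    """Step 2: Jaccard overlap of word sets (an assumed similarity measure)."""
    a, b = set(tokens), set(tokenize(template))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(text: str, categories: list[Category],
             weight_threshold: float = 2.0,
             similarity_threshold: float = 0.6) -> list[str]:
    """Return the names of all categories the message is assigned to."""
    tokens = tokenize(text)
    matched = []
    for category in categories:
        weight = keyword_weight(tokens, category)
        best = max((template_similarity(tokens, t) for t in category.templates),
                   default=0.0)
        # Either condition alone is enough to assign the category.
        if weight >= weight_threshold or best >= similarity_threshold:
            matched.append(category.name)
    return matched


spam = Category(
    name="spam",
    keywords={"free": 1.0, "winner": 1.5, "prize": 1.0},
    templates=["claim your free prize now"],
)
print(classify("You are a WINNER! Claim your free prize", [spam]))
print(classify("Meeting moved to 3 pm tomorrow", [spam]))
```

The first message exceeds the keyword threshold and is assigned to the hypothetical `spam` category, while the second matches neither condition. A production filter would tune the weights and thresholds against real mail.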
New categories can be added manually by specifying keywords and creating templates. In addition, each category can be divided into subcategories, which provides a more detailed classification. Message text can also be pre-processed using techniques such as automatic language detection, removal of frequently used words and noise filtering.
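The pre-processing steps mentioned above might look roughly like the following Python sketch. The tiny stop-word lists, the stop-word counting heuristic for language detection, the definition of "noise" (HTML-like tags, digits and punctuation) and the function names are all assumptions made for illustration; the article does not say how the patented technology implements these steps.

```python
import re

# Tiny illustrative stop-word lists; real lists are far longer.
STOP_WORDS = {
    "en": {"the", "a", "an", "and", "or", "to", "of", "is", "you", "your", "in"},
    "de": {"der", "die", "das", "und", "oder", "zu", "von", "ist", "sie", "ihr", "in"},
}


def strip_noise(text: str) -> list[str]:
    """Drop HTML-like tags, then keep only runs of letters, lowercased."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.findall(r"[^\W\d_]+", text.lower())


def detect_language(tokens: list[str]) -> str:
    """Pick the language whose stop words occur most often (crude heuristic)."""
    return max(STOP_WORDS, key=lambda lang: sum(t in STOP_WORDS[lang] for t in tokens))


def preprocess(text: str) -> tuple[str, list[str]]:
    """Return the guessed language and the tokens left after stop-word removal."""
    tokens = strip_noise(text)
    lang = detect_language(tokens)
    return lang, [t for t in tokens if t not in STOP_WORDS[lang]]


print(preprocess("<b>Claim</b> your FREE prize!!! $$$"))
print(preprocess("Sie sind der Gewinner"))
```

Running the cleaned tokens, rather than the raw text, through the keyword and template checks keeps markup and filler words from skewing the weights.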