Lemmatization

Word          Stemmed     Lemma
caring        car         care
running       run         run
flies         fli         fly
happily       happi       happily *
geese         gees        goose **
historical    histori     historical
better        better      better
largest       largest     large
jumps         jump        jump
quickly       quickli     quickly *
easily        easili      easily *

Exhibit 25.12 Stemming and Lemmatization.
* lemmatization retains the base form for adverbs
** plural to singular

Lemmatization is a more advanced linguistic technique that reduces words to their base or dictionary form, known as a lemma. Unlike stemming, which simply cuts off word endings, lemmatization considers the word’s part of speech and grammatical context, leading to more meaningful results. For example, “caring” is reduced to “care” through lemmatization, whereas stemming might produce “car”. While lemmatization is computationally more intensive than stemming, it is often preferred in applications where precision is paramount.
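The contrast described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method behind any particular library: the suffix list, the tiny lemma dictionary, and the function names `crude_stem` and `lemmatize` are all hypothetical, and the word "meeting" is added only to show how part of speech changes the result.

```python
# Toy illustration only. The suffix list and the lemma dictionary below are
# hypothetical and far smaller than anything a real stemmer or lemmatizer uses.

SUFFIXES = ("ing", "ly", "es", "s")

def crude_stem(word: str) -> str:
    """Cut off the first matching suffix, ignoring grammar entirely."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# (word, part of speech) -> dictionary form (lemma)
LEMMA_DICT = {
    ("caring", "verb"): "care",
    ("flies", "noun"): "fly",
    ("geese", "noun"): "goose",
    ("meeting", "verb"): "meet",
}

def lemmatize(word: str, pos: str) -> str:
    """Look up the lemma for a word and its part of speech.

    Words missing from the dictionary are returned unchanged.
    """
    word = word.lower()
    return LEMMA_DICT.get((word, pos), word)

if __name__ == "__main__":
    for word, pos in [("caring", "verb"), ("flies", "noun"),
                      ("geese", "noun"), ("meeting", "noun"),
                      ("meeting", "verb")]:
        print(word, pos, crude_stem(word), lemmatize(word, pos))
```

In this sketch the stemmer turns "caring" into "car" because it only strips characters, while the lemmatizer returns "care" because it consults a dictionary. The same surface form can also yield different lemmas: "meeting" stays "meeting" as a noun but becomes "meet" as a verb, which is the part-of-speech sensitivity that simple suffix stripping lacks.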

Exhibit 25.12 compares stemming and lemmatization. Stemming is typically used with large datasets, where processing speed is a concern, and it is common in sentiment analysis. Lemmatization, on the other hand, is widely used in chatbots and question-answering systems, where accuracy matters more than speed.

