Turkish-lemmatizer

Turkish Lemmatizer finds the root form of Turkish words.

Introduction

Turkish Lemmatizer finds the stem (root) form of Turkish words.

In morphologically complex languages such as Turkish, stemming is a difficult task, so using a lemmatizer is a sensible solution. The approach taken here is to use a predefined stem list and, for each word, pick the stem that gives the longest match starting from the beginning of the word.
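The longest-match idea described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the library's actual code: the class name `LongestMatchSketch`, the method `longestMatchingStem`, and the tiny stem list are all hypothetical.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

public class LongestMatchSketch {

    // Hypothetical helper (not part of the library's API):
    // returns the longest entry of the stem list that is a prefix of the word.
    public static Optional<String> longestMatchingStem(String word, Set<String> stems) {
        // Lowercase with the Turkish locale so that 'I' maps to dotless 'ı'.
        String lower = word.toLowerCase(Locale.forLanguageTag("tr"));
        // Try the longest prefix first and shrink until a known stem is found.
        for (int end = lower.length(); end > 0; end--) {
            String prefix = lower.substring(0, end);
            if (stems.contains(prefix)) {
                return Optional.of(prefix);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed toy stem list: "el" (hand), "elma" (apple), "ev" (house).
        Set<String> stems = Set.of("el", "elma", "ev");
        System.out.println(longestMatchingStem("elmalar", stems).orElse("?")); // prints "elma"
        System.out.println(longestMatchingStem("evler", stems).orElse("?"));   // prints "ev"
    }
}
```

Note that "elmalar" (apples) resolves to "elma" rather than the shorter "el", because the longest matching prefix wins. If "elma" were missing from the stem list, the same call would return "el", which shows how strongly the result depends on the list.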

Turkish Lemmatizer uses the "Longest Matched Stemming" algorithm. In addition, it handles some of the word-formation patterns that are commonly seen in Turkish words.
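The text does not say which word-formation patterns are handled. As one plausible illustration, the sketch below extends longest-match lookup with Turkish consonant softening, where a stem-final p, ç, t, or k becomes b, c, d, or ğ before a vowel-initial suffix (for example "kitap" becomes "kitaba"). The class `SofteningSketch`, the method `findStem`, and the choice of this particular rule are assumptions made for the example, not a description of the library's implementation.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public class SofteningSketch {

    // Softened consonant -> original hard consonant (b->p, c->ç, d->t, ğ->k).
    private static final Map<Character, Character> HARDEN = Map.of(
            'b', 'p',
            'c', '\u00e7',   // ç
            'd', 't',
            '\u011f', 'k');  // ğ

    // Hypothetical helper: longest-match lookup that also tries to undo softening.
    public static Optional<String> findStem(String word, Set<String> stems) {
        for (int end = word.length(); end > 0; end--) {
            String prefix = word.substring(0, end);
            if (stems.contains(prefix)) {
                return Optional.of(prefix);
            }
            // Softening occurs before a vowel-initial suffix; this sketch only
            // checks that at least one character follows the candidate stem.
            Character hard = HARDEN.get(prefix.charAt(end - 1));
            if (hard != null && end < word.length()) {
                String restored = prefix.substring(0, end - 1) + hard;
                if (stems.contains(restored)) {
                    return Optional.of(restored);
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed toy stem list: "kitap" (book), "kanat" (wing).
        Set<String> stems = Set.of("kitap", "kanat");
        System.out.println(findStem("kitaba", stems).orElse("?")); // prints "kitap"
        System.out.println(findStem("kanada", stems).orElse("?")); // prints "kanat"
    }
}
```

A plain prefix match would miss both words, since neither "kitab" nor "kanad" appears in the stem list; restoring the hard consonant before the lookup recovers the dictionary form.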

In this library, the performance of the lemmatization process depends entirely on the supplied stem list.

Obtaining the Library

turkish-lemmatizer-v0.0.2/
├── lib
│   └── turkish-lemmatizer-0.0.2.jar
├── LICENSE
└── README.md

Lemmatizer Usage

For usage, please visit the wiki page.

Personal Request from the Developer

If you use this library in a scientific project, please provide feedback. Such feedback can be used to improve the algorithm used in the lemmatizer.

License

This project is licensed under the Apache License v2.0.

Support or Contact

Having trouble with the lemmatizer? You may email your problems to me at baturman (at) gmail.com. If you find an issue, please report it in the issues section of the GitHub project.