Aware of the obstacles facing the South Slavic WaC approach, we took advantage of the already operating
Croatian academic online spellchecker Hascheck⁷ and started collecting Croatian n-grams, n = 1, 2, …, 5, a year after
the appearance of Brants and Franz's English n-gram system². The original intention was to use the n-grams as the basis
for upgrading Hascheck into a contextual spellchecker, but in the course of development it became clear that the
results were much more broadly applicable. From the respectable amount of data collected so far, we have
developed a consistent, maintainable and upgradable n-gram system, comparable in size to the largest Google n-gram
systems.