Named Entity Recognition Based Approach for Automatic Turkish Financial Document Verification


Toprak A., Turan M.

10th International Conference on Computer Science and Engineering, UBMK 2025, İstanbul, Türkiye, 17-21 September 2025, pp. 315-320 (Full Text Paper)

  • Publication Type: Conference Paper / Full Text Paper
  • DOI Number: 10.1109/ubmk67458.2025.11206872
  • City of Publication: İstanbul
  • Country of Publication: Türkiye
  • Pages: pp. 315-320
  • Keywords: document verification, financial reports, named entity recognition, spell checker
  • Affiliated with İstanbul Gelişim University: Yes

Abstract

This study proposes a hybrid Named Entity Recognition (NER) based method for automatic verification of Turkish financial documents. A fine-tuned Bidirectional Encoder Representations from Transformers (BERT) model extracts named entities, supported by rule-based regular expressions for entity types the model does not cover, such as currency codes and email addresses. Similarity between summary sentences and full-text sentences is calculated using Simhash, and sentence-level entity matches are used to determine verification accuracy. Integration of a spell checker is also evaluated. The method was evaluated on two datasets, one financial and one sports, achieving average verification accuracies of 91.6% and 79%, respectively. The results demonstrate that combining machine learning with domain-specific rules can significantly improve verification performance, particularly in low-resource Turkish natural language processing settings.
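
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fine-tuned BERT model is left out, so only the regex rules supply entities, and the patterns, the 64-bit MD5-based Simhash, and the rule "a summary sentence is verified when all of its entities appear in its most similar full-text sentence" are assumptions made for illustration. All function names are hypothetical.

```python
import hashlib
import re

# Assumed patterns; the abstract does not give the paper's actual rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CURRENCY_RE = re.compile(r"\b(?:TRY|USD|EUR|GBP)\b")

BITS = 64  # fingerprint width used by this sketch


def rule_entities(sentence: str) -> set[str]:
    """Entities found by the regex rules only (the BERT model is omitted here)."""
    return set(EMAIL_RE.findall(sentence)) | set(CURRENCY_RE.findall(sentence))


def simhash(text: str) -> int:
    """64-bit Simhash over lowercased word tokens, each token weighted equally."""
    counts = [0] * BITS
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # first 64 bits of the token hash
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set when more tokens voted 1 than 0 at that position.
    return sum(1 << i for i in range(BITS) if counts[i] > 0)


def similarity(a: str, b: str) -> float:
    """1 minus the normalized Hamming distance between the two fingerprints."""
    distance = bin(simhash(a) ^ simhash(b)).count("1")
    return 1.0 - distance / BITS


def verify(summary_sentences: list[str], full_text_sentences: list[str]) -> float:
    """Fraction of summary sentences whose entities all occur in the closest full-text sentence."""
    verified = 0
    for s in summary_sentences:
        best = max(full_text_sentences, key=lambda f: similarity(s, f))
        if rule_entities(s) <= rule_entities(best):
            verified += 1
    return verified / len(summary_sentences)
```

Under these assumptions, `verify(summary, full_text)` returns a value between 0 and 1 that plays the role of the verification accuracy; a fuller version would merge the model's predicted entities into `rule_entities` before the subset check.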