LLM-Lab
easgari[at]hbku[dot]edu.qa
Ehsaneddin Asgari’s research focuses on developing multimodal and multilingual language technologies, with particular emphasis on document understanding and multimodal retrieval-augmented generation for Arabic and languages of the MENA region. His work is conducted within the Arabic Language Technologies group at the Qatar Computing Research Institute (QCRI).
Research Focus
Our work is organized around two pillars:
1. Multimodal Document Understanding
- Optical character recognition (OCR) for diverse scripts and document types
- Document layout analysis and reasoning over text, tables, and figures
- Retrieval-augmented generation (RAG) — retrieval, reasoning, and generation components
- Domain-specific RAG systems (e.g., Quranic studies, legal texts, historical archives)
2. Multimodal Multidialectal Arabic Language Technologies
- Arabic natural language processing (NLP) across dialects and registers
- Language resources for the dialects and cultures of the MENA region
- Language technologies for digital humanities
- NLP for digital health and well-being
Other Areas of Interest
- Capability-oriented LLM benchmarking and enhancement
- AI for multimedia art and creative applications
- Protein language modeling and antimicrobial resistance prediction using multimodal language models
- NLP for MENA region languages beyond Arabic
Collaborations & Opportunities
We collaborate with a global network of researchers and welcome undergraduate and graduate students in Computer Science and Linguistics for research internships (remote or on-site).
Interested? Send your CV to the email address above to discuss potential opportunities.
News
| Date | Update |
|---|---|
| Apr 19, 2026 | Two papers arising from our work on Fanar were accepted to Findings of ACL 2026: MorphBPE: Morphology-Aware Tokenization for Efficient LLM Training and Almieyar-Oryx-BloomBench: A Bilingual Multimodal Benchmark for Cognitively Informed Evaluation of Vision-Language Models. |
| Mar 29, 2026 | SilkRoadNLP workshop held at EACL 2026 in Rabat, Morocco — our initiative on NLP for the Iranian family of languages and cultures along the historical Silk Road. |
| Mar 27, 2026 | We are honored to receive the Best Resource Paper Award at EACL 2026 in Rabat, Morocco! |
| Mar 24, 2026 | Arabic NLP School held at EACL 2026 in Rabat, Morocco — a full-day school on foundational and advanced topics in Arabic language technologies, with over 120 participants selected from 300+ applications. |
| Jan 03, 2026 | Three papers accepted at EACL 2026 — two in the main conference and one in Findings. Congratulations to all co-authors! |