Publications
Publications by category in reverse chronological order.
2026
- MorphBPE: A Morpho-Aware Tokenizer Bridging Linguistic Complexity for Efficient LLM Training Across Morphologies. In Findings of the Association for Computational Linguistics (ACL), 2026
- BloomBench: A Bilingual Multimodal Benchmark for Cognitively Informed Evaluation of Vision-Language Models. In Findings of the Association for Computational Linguistics (ACL), 2026
- HarfoSokhan: A Comprehensive Parallel Dataset for Transitions between Persian Colloquial and Formal Variations. In European Chapter of the Association for Computational Linguistics (EACL) (Main), 2026
- Detecting Subtle Biases: An Ethical Lens on Underexplored Areas in AI Language Models Biases. In European Chapter of the Association for Computational Linguistics (EACL) (Main), 2026
- MEENA (PersianMMU): Multimodal-Multilingual Educational Exams for N-level Assessment. In European Chapter of the Association for Computational Linguistics (EACL) (Findings), 2026
- Eye-Q: A Multilingual Benchmark for Visual Word Puzzle Solving and Image-to-Phrase Reasoning. arXiv preprint, 2026
2025
- Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation. In Meeting of the Association for Computational Linguistics (ACL), 2025
- Fanar: An Arabic-Centric Multimodal Generative AI Platform. arXiv preprint arXiv:2501.13944, 2025
- ChemLM: Domain adaptable language modeling of chemical compounds identifies potent pathoblockers for Pseudomonas aeruginosa. Communications Chemistry, 2025
- CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval. In The European Conference on Information Retrieval (ECIR), 2025
- Emo3D: Metric and Benchmarking Dataset for 3D Facial Expression Generation from Emotion Description. In Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) Findings, 2025
- Context-Aware Extraction of Quranic References: A Hybrid Language Model- and Rule-Based Approach. In Muslims in ML Workshop at NeurIPS, 2025
- GeoPolRAG: Retrieval Augmented Generation for Contextually Grounded QA on Complex Geopolitical Matters. In Muslims in ML Workshop at NeurIPS, 2025
- Generative AI and its Benchmarking for Quranic Question Answering. In Muslims in ML Workshop at NeurIPS, 2025
- PahGen: Generating Ancient Pahlavi Text via Grammar-guided Zero-shot Translation. In LoResMT (Workshop on Low-Resource Machine Translation), 2025
- ParsiPy: NLP Toolkit for Historical Persian Texts in Python. In Workshop on Ancient Language Processing, 2025
- ImageEval 2025: The First Arabic Image Captioning Shared Task. In Arabic NLP Conference (Shared Tasks), 2025
2024
- In Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), 2024
- Briefings in Bioinformatics, 2024
- TuringQ: Benchmarking AI Comprehension in Theory of Computation. In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
- SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings. In Efficient Natural Language & Speech Processing at NeurIPS, 2024
- Transformers for Bridging Persian Dialects: Transliteration Model for Tajiki and Iranian Scripts. In Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), 2024
- Patent: Morphologically-Aware Tokenizer. U.S. Provisional Patent Application No. 63/679,403, 2024
- AIMA at SemEval-2024 Task 3: Simple Yet Powerful Emotion Cause Pair Analysis. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval), 2024
- M3Face: A Unified Multi-Modal Multilingual Framework for Human Face Generation and Editing. arXiv preprint, 2024
2023
- In Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), 2023
- XPASC: Measuring Generalization in Weak Supervision. Natural Language Engineering Journal, 2023
- KhabarChin: Automatic Detection of Important News in the Persian Language. arXiv preprint, 2023
- Sina at SemEval-2023 Task 4: A Class-Token Attention-based Model for Human Value Detection. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval), 2023
- A platform for deep learning on (meta) genomic sequences. Preprint: Europe PMC, 2023
2022
- In Proceedings of TextGraphs-16: Graph-based Methods for Natural Language Processing, 2022
- Bioinformatics Journal, 2022
- Hengam: An Adversarially Trained Transformer for Persian Temporal Tagging. In Proc. of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (AACL), 2022
- Docalog: Multi-document Dialogue System using Transformer-based Span Retrieval. In Proc. of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, 2022
2021
- Bioinformatics, 2021
- IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2021
- Empirical Methods in Natural Language Processing (EMNLP), 2021
- Patent: Method, Computer Program and Apparatus for Relating Text Units. European Patent 20206180.0.6 - 1231, January 2021
2020
- Proceedings of The 12th Language Resources and Evaluation Conference (LREC), 2020
- Story Fragment Stitching: The Case of the Story of Moses. In 1st Workshop on Artificial Intelligence for Narratives (AI4N) at the International Conference on Artificial Intelligence (IJCAI), 2020
- Data-driven Variable-length Segmentation of Biological Sequences: Applications in Metagenomics and Proteomics. In NeurIPS - Computational Biology Workshop, 2020
- Patent: Method, Computer Program and Apparatus for Detecting a Semantic Change of a Word between Domains. European Patent 20190280.6 - 1231, November 2020
2019
2018
- Biophysical Journal, 2018
2017
- In Association for Computational Linguistics, 2017
- In Springer International Publishing, 2017
- In https://ssrn.com/abstract=3029031, 2017
2016
- In Association for Computational Linguistics, 2016
- In Association for Computational Linguistics, 2016
2015
- A New Approach for Scalable Analysis of Microbial Communities. arXiv preprint, 2015
2014
2013
- In Association for Computational Linguistics, 2013
- In NeurIPS Workshop on Topic Models, 2013
- In Springer Berlin Heidelberg, 2013