Marzena Karpinska

prospective students

I am looking to hire 1-2 students starting Fall 2026. If you are interested, you should apply via this page by January 19th. Feel free to also reach out to me via email (note: I may not be able to reply to all emails, but I AM reading them all). Students from underrepresented groups are encouraged to apply!

about me

I am currently an assistant professor at Simon Fraser University in beautiful Vancouver, Canada. Before that, I was a senior researcher at Microsoft based in Redmond. I did my postdoc at the Manning College of Information & Computer Sciences, University of Massachusetts Amherst, working with Prof. Mohit Iyyer.

I hold a Ph.D. from the Department of Language and Information Sciences at the University of Tokyo.

research

I am interested in how well natural language processing (NLP) systems handle long-form content, both as input and output. My work includes areas like machine translation of creative texts, story generation, summarizing long texts, verifying claims about book-length content, and multilingual long-form question answering.

media

PressGazette Study claims 9% of US newspaper articles at least partly AI generated
FORBES Study Shows Experienced Humans Can Spot Text Created By AI
THE ECONOMIST GPT, Claude, Llama? How to tell which AI model is best
TECH CRUNCH Gemini's data-analyzing abilities aren't as good as Google claims

news

Jan 2026 Started as an assistant professor at Simon Fraser University in Vancouver, Canada.
Oct 2025 New paper on AI-generated text in US news articles is out
Aug 2025 Papers on an interdisciplinary approach to MT, cross-lingual memorization, and quantization effects on long-context tasks accepted to EMNLP 2025 🎉
Jul 2025 Our work on multilingual long-context processing was accepted to COLM 🎉
Jun 2025 Preprint on an interdisciplinary approach to machine translation is out
Jun 2025 Serving as a Senior Area Chair at EMNLP
May 2025 Our preprint on cross-lingual memorization is out
May 2025 Our preprint on the effect of quantization on long-context tasks is out
May 2025 Our papers on AI-generated text and multilingual long-form QA were accepted to ACL 🎉
Mar 2025 Our preprint on multilingual long-context processing is out
Jan 2025 Our preprint on machine-generated text detection is out
Jan 2025 New preprint on slow-down attacks on reasoning models is out
Nov 2024 Recognized as an outstanding AC at EMNLP 2024
Sep 2024 NoCha was accepted to EMNLP 2024 🎉
Sep 2024 ESA was accepted to WMT 2024 🎉
Aug 2024 Presented our work on the evaluation of long-context language models at UNSW
Jul 2024 Presented our work on the evaluation of long-context language models at RMIT
Jul 2024 Presented our work on the evaluation of long-context language models at the University of Melbourne
Jul 2024 FABLES was accepted to COLM 2024 🎉
Jun 2024 Our preprint on LONG-CONTEXT processing capabilities of language models is out
Jun 2024 Our preprint on MULTI-LINGUAL/CULTURAL performance of language models is out
Jun 2024 Our preprint on more robust evaluation for machine translation is out
Apr 2024 Our preprint on faithfulness in book-length summaries is out
Mar 2024 NarrativeTime was accepted to LREC-COLING 2024 🎉
Dec 2023 Presented our work at WMT in Singapore.
Nov 2023 Launched litmt.org, a platform for sharing machine-translated world literature.
May 2023 Virtual talk at Instituto Superior Técnico & Unbabel Seminar on translation with Large Language Models.
Apr 2023 Virtual talk at Microsoft MT Reading Group on translation with Large Language Models.
Jan 2023 Virtual talk at Polish Academy of Sciences on Evaluation of Long-form Text Generation.
Dec 2022 Presented our work on diagnosing automatic evaluation metrics at EMNLP in Abu Dhabi.
Dec 2022 Presented our work on document-level MT at EMNLP in Abu Dhabi.