Katherine Keith

Katherine A. Keith

  kak5 (at) williams (dot) edu

  curriculum vitae (CV)

I am an Assistant Professor in the Computer Science department at Williams College. Previously, I was a Postdoctoral Young Investigator at the Allen Institute for Artificial Intelligence. I received a PhD from the Manning College of Information and Computer Sciences at the University of Massachusetts Amherst in August 2021, and I am grateful to have been supported by a Bloomberg Data Science PhD Fellowship from 2019 to 2021.

My research is in the domain of social data science, answering questions about human behavior through quantitative analysis of large-scale data. I focus on methods and applications with text data because language is one of the richest and most salient expressions of human thought and behavior. This type of research is closely aligned with the fields of computational social science and text-as-data.

My research expands methods in machine learning and natural language processing to social data science goals including: obtaining quantifiable social measurements from text data, aggregating said measurements in a statistically rigorous manner, and improving causal estimations from text. [Talk bio].

I am interested in projects that build bridges between natural language processing and the social sciences. In line with this goal, Ian Stewart and I organized the NLP+CSS 201 Online Tutorial Series. I also host the podcast Diaries of Social Data Research with Lucy Li in which we probe the “research diaries” of scholars in computational social science and adjacent fields with the hope of normalizing the challenges of and increasing accessibility in academia.

Recent News

  • February 8, 2023 — I gave an invited talk for the Institute for Analytical Sociology's Seminar series entitled "Melding NLP and Causal Inference" [slides] [recording].
  • December 2023 — I've migrated to Bluesky.
  • November 10-11, 2023 — I presented our work [slides] and was a discussant [slides] at TADA 2023.
  • October 25, 2023 — I gave an online guest lecture for Dora Demszky's CS 293 / EDUC 473: Empowering Educators via Language Technology at Stanford.
  • September 27, 2023 — Here's an interview with me for E-International Relations discussing computational social science and text-as-data.
  • September 8, 2023 — I gave a talk for the University of Iowa's Computer Science Colloquium.
  • July 18, 2023 — I gave a talk to the Summer Science Program at Williams College [slides].
  • June 5, 2023 — Congrats to my NLP students David Goetze and Mark Bissell who won the 2023 Rich Ward Prize for Best Student Project in Computer Science for their NLP project "RAGdoll: A Retrieval-Augmented Generation System for Williams College Academic Advising." This prize is given to only one project group across all of Williams computer science for a given academic year.
  • March 20, 2023 — Our paper was mentioned on Strict Scrutiny, a podcast about the Supreme Court.
  • January 6, 2023 — Congrats to Lucy Li, my mentee at AI2 during the summer of 2022, for winning AI2's Outstanding Intern of the Year Award. Check out the pre-print of Lucy's summer internship project here.
  • December 5-12, 2022 — I am traveling to Abu Dhabi, UAE to attend EMNLP! I am one of the organizers of the NLP+CSS Workshop and one of my collaborators is presenting our TACL paper.
  • November 2022 — I am honored to have been awarded a Young Investigator Grant from the Allen Institute for Artificial Intelligence.
  • October 19, 2022 — I was the speaker at the Williams College Statistics Colloquium [slides].
  • October 6, 2022 — I am presenting a poster at TADA 2022.
  • August 30, 2022 — Excited to be on the organizing committee for the NLP+CSS workshop at EMNLP this year. Submit your papers by September 19!
  • August 9, 2022 — I was featured in a blog post by AI2.
  • July 15, 2022 — I started my position at Williams College and have moved to beautiful Williamstown, MA.
  • May 24, 2022 — I presented a poster at the 2022 American Causal Inference Conference (ACIC).
  • March 30, 2022 — I created and presented a tutorial, "Aggregated Classification Pipelines: Propagating Probabilistic Assumptions from Start to Finish," for our NLP+CSS 201 tutorial series.
  • March 28, 2022 — I presented a guest lecture for Diyi Yang's Computational Social Science seminar at Georgia Tech. Slides here.
  • November 12, 2021 — I moderated the panel discussion at the CI+NLP Workshop at EMNLP. Here are some paraphrased highlights from our discussion.
  • November 9, 2021 — Lucy Li and I were interviewed on the radio show "The Graduates" about our podcast, research, and our field more broadly.
  • October 28-29, 2021 — I presented two posters at TADA, held at the University of Michigan, Ann Arbor.
  • September 24, 2021 — I am co-organizing (with Ian Stewart) NLP+CSS 201: Beyond the basics, an online hands-on tutorial series about advanced methods in natural language processing and computational social science. We are grateful to have received assistance for this series from the Social Science Research Council (SSRC)/Summer Institutes in Computational Social Science (SICSS) Research Grant.
  • August 17, 2021 — I successfully defended my PhD dissertation (recording of the defense)!
  • June 30, 2021 — Our IndiaPoliceEvents corpus was featured in the Data Is Plural newsletter.
  • June 14-18, 2021 — I am participating in the Summer Institute for Computational Social Science this week. Here are slides for the flash talk I gave during the event.
  • June 14, 2021 — Lucy Li and I launched our new podcast, Diaries of Social Data Research, where we probe the “research diaries” of scholars in computational social science and adjacent fields with the hope of normalizing the challenges of and increasing accessibility in academia.
  • January 2021 — I am thankful to have successfully made it through the academic job market! Here are my research statement, teaching statement, diversity statement, Williams cover letter, and job talk slides. Please reach out if you have questions about the job market process (especially for computer science at liberal arts colleges).
  • December 2020 — I was named an Outstanding Reviewer for EMNLP 2020.
  • November 2020 — I am on the organizing committee for the 1st Workshop on Causal Inference & NLP which will be held at EMNLP 2021.
  • April 2020 — Andy Halterman, Sheikh Sarwar, and I were awarded a $5,000 Kaggle Open Data Research Grant for "Semantic Role Annotations For Real-World Political Texts."
  • May 2020 — Judea Pearl tweeted about our paper!
  • Spring 2020 — I joined and am actively working with the REBLS (Research, Educator, Business Leaders, and Students) Network to increase access and opportunity for underrepresented students in computer science and engineering!
  • January 1, 2020 — It has been my distinct pleasure to serve as Co-Chair of CSWomen, a group of graduate women in computer science here at UMass, for two semesters! Here is a recap of the events we organized last semester.
  • December 6, 2019 — I passed portfolio (the equivalent of a PhD candidacy exam) with distinction.
  • November 15, 2019 — I was the invited speaker at Williams College's Computer Science Colloquium.
  • August 2019 — Honored to have received a Bloomberg Data Science PhD Fellowship. Here's a profile from UMass's Center for Data Science.
  • July 2019 — Really enjoyed mentoring three undergrad researchers this summer. Check out a profile on their experience.
  • September 24, 2018 — I was the invited speaker at Lewis & Clark College's Mathematical Sciences Colloquium.
  • September 2018 — Su Lin Blodgett, Abe Handler, and I co-designed and co-taught Ethical Issues Surrounding Artificial Intelligence Systems and Big Data, a semester-long first-year computer science seminar at UMass.
  • May 2018 — I am interning with Amanda Stent at Bloomberg L.P. in New York this summer.
  • November 2017 — I helped design and organize our college's first Male Ally Workshop for graduate computer science students.

Misc.

    In the past, I really enjoyed studying Chinese. I lived for twelve months in Kinmen, Taiwan on a Fulbright English Teaching Assistantship, and I completed a language-immersion study abroad program in Beijing, China during a semester in undergrad.

    In my free time, I enjoy recreating outside! I grew up in Montana, where I learned to love trail-running, sport climbing, triathlons, and all types of skiing, particularly cross-country skate skiing, alpine skiing, and backcountry touring.