Lab 214

140 Governors Dr

Amherst, MA


My name is Ke Yang (杨珂). I’m a postdoctoral researcher at the College of Information and Computer Sciences, University of Massachusetts, Amherst. I obtained my Ph.D. from the Tandon School of Engineering, New York University, under the supervision of Prof. Julia Stoyanovich.

My primary research area is data management, with a focus on ethical concerns including fairness, transparency, explainability, and the social impact of algorithms in data-driven systems.

I am interested in designing algorithms that mitigate undesirable outcomes of automated decision-making processes and in developing tools that apply these algorithms across applications.

Professional Experience

  • Postdoctoral Research Associate, College of Information and Computer Sciences, University of Massachusetts, Amherst, 2021.09 ~ now
    • Supervisor: Alexandra Meliou
    • Project: Interpretable Machine Learning with data-focused explanations.
  • Research Assistant, Tandon School of Engineering, New York University, 2019 ~ 2021
  • Research Intern, AT&T Labs Research, New York, 2019 Summer
    • Supervisors: Emily Dodwell, Ritwik Mitra, and Balachander Krishnamurthy
    • Worked on a project to ensure fairness and transparency for internal AT&T machine learning projects. The project aims to ensure that each project’s outcomes meet reasonable guidelines for fairness, without illegal bias or discrimination. We developed a diagnostic tool for bias-related issues that project managers and data scientists at AT&T can use.
  • Research Assistant, College of Computing & Informatics, Drexel University, 2015 ~ 2018
    • Supervisor: Julia Stoyanovich
    • Worked on a project to quantify fairness in rankings through equalized representation across groups, and proposed a mitigation framework to ensure fair ranking outcomes.
  • Research Engineer, Elite & Resource (start-up company), Beijing, 2014 ~ 2015
    • Supervisor: Peng Sun
    • Worked on a project to model historical flood disaster data and develop techniques to help with flood prevention. In this project, we proposed a model that predicts the probability of future floods from the water flow rate of small watershed torrents, which is recognized as a significant signal of potential flooding. Our model is integrated as a core component of the national watershed data management system.

Open Source Tools

  • Mirror Data Generator
    • A Python script that generates synthetic data mirroring issues such as sampling bias and societal bias. The issues are described by the correlation between features.
  • Ranking Facts
    • A web-based tool that generates a “nutritional label” for rankings. Each label shows a fact about the ranking. For example, a fact about fairness explains whether the ranking shows statistical parity between groups that are defined by a user-specified feature.
  • FairDAGs
    • A web-based tool that extracts a directed acyclic graph (DAG) representation of a data science pipeline and tracks how each operation changes the distributions of targets and groups. The groups are often defined by a user-specified feature in the dataset.

Last updated on 09/03/2021