Hanlin Zhang
Why do you care about AI Existential Safety?
Modern machine learning models are trained on vast amounts of diverse data. Models learned via self-supervision on pretext tasks can perform well across a broad range of downstream tasks. Yet this paradigm also poses great challenges to trustworthiness, spanning robustness, privacy, fairness, calibration, and interpretability. My work studies these concerns and proposes effective solutions to ensure the safe deployment of models in consequential decision making.
Please give at least one example of your research interests related to AI existential safety:
Trustworthy ML in the wild: How can we identify problematic behaviors of ML models in consequential decision-making and develop algorithmic tools to mitigate them?
Understanding and improving learning through reasoning: How can we leverage language to instill useful inductive biases for reasoning, and thereby make progress on important trustworthiness problems?