In search of an answer to this question, I test AI systems against real threats: backdoor detection in LLMs, adversarial attacks on vision-language models, and algorithmic auditing.
I'm Aditya Bansal, a CS undergrad at BITS Pilani working on AI control, model evaluations, and cross-architecture threat modeling. I'm currently building toward predoctoral and research fellowship roles before a PhD.
Adversarial ML and AI safety research.
Dark pattern detection in ride-hailing applications.
Early-stage product engineering.
I'm starting a Substack where I'll write about AI safety research, paper breakdowns, technical commentary on my projects, and random life lessons.
Subscribe when it's live
Always open to discussing research, collaboration, or new ideas.