Data Scientist @ Dell
IR, Representation Learning, DL, LLM
Email: sumantakashyapi [at] gmail [dot] com
Sumanta Kashyapi is a senior data scientist at Dell Technologies, where he is part of the predictive analytics team. Before joining Dell, he received his PhD in Computer Science from the University of New Hampshire, advised by Prof. Laura Dietz, and his master's degree from the National Institute of Technology Hamirpur, advised by Prof. Madhu Kumari. His main research interests lie at the intersection of Representation Learning and Information Retrieval (IR). During his PhD, Sumanta focused on representation learning for clustering short text snippets and investigated how it can be leveraged for various IR tasks. His recent work revolves around efficiently incorporating complex structures found in data into learned representations with a specific downstream task in mind.
JCDL 2022 - Best student paper nominee
Can we improve search result clustering (SRC) by directly incorporating the query context into the trained similarity metric used for clustering? To investigate this, we propose the Query-Specific Siamese Similarity Metric (QS3M) for query-specific clustering of text documents.
Can we generalize contrastive learning for clustering tasks by directly optimizing a clustering quality metric such as the Rand index? We present Clustering Optimization as Blackbox (COB), which employs a recent optimization technique suitable for discrete metrics, and show that it leads to better representations for clustering.
Dense Passage Retrieval (DPR) relies on the underlying embedding space to find relevant documents in response to a query. In this work, we explore whether an embedding model trained with an auxiliary clustering objective improves the retrieval quality of a DPR system.