Shubhendu Trivedi

(Still avoiding mugshots!)

About me, research interests, background etc.

Update: Since October 2018, I have been an Institute Fellow at the Institute for Computational and Experimental Research in Mathematics at Brown University. I am also a research affiliate with the Computer Science and Artificial Intelligence Laboratory at MIT, attached to the group of Prof. Regina Barzilay, who is also my mentor for the fellowship. I'll also be a researcher-in-residence for ICERM's Computer Vision program from February to May 2019.

I defended my PhD thesis on 16 August 2018 (PDF forthcoming) and formally completed all the requirements for the PhD, including submission of the final thesis document, on 31 August 2018. My dissertation committee comprised Kevin Gimpel, Risi Kondor, Brian D. Nord and Gregory Shakhnarovich (thesis supervisor). Here is a post-defense picture with the committee along with an honorary committee member, till I update this website and also set up a new one.

I am a PhD candidate with broad interests in Machine Learning. In particular, I have a predilection for (deep and otherwise) representation learning, structured prediction and general semi/weakly supervised learning. Currently, I am exploring problems in the supervised learning of similarity and distance in low-shot regimes, as well as learning representations for combinatorial structures such as graphs and sets. I am interested in and seek inspiration from applications of machine learning in computer vision and, more recently, the physical sciences, especially computational chemistry and physics. I also maintain an amateur interest in combinatorics and spectral graph theory.

For my research I consider myself very fortunate to be working under the supervision of Prof. Gregory Shakhnarovich at the Toyota Technological Institute at Chicago. I also work very closely with Prof. Risi Kondor in the Departments of Statistics and Computer Science at The University of Chicago and Dr. Brian D. Nord at the Kavli Institute for Cosmological Physics and Fermilab (Group: Deep Skies Lab). During the course of my PhD, I have also had the unusual and enriching experience of getting to design, prepare and teach a large graduate course in deep learning (under the tutelage of and with Prof. Kondor -- also read this Symmetry magazine article that mentions our class). My most recent industrial research internship was at NEC Labs America, where I was mentored by Dr. Ryohei Fujimaki, for work on robust optimization.

Some background:
Prior to candidacy, I completed an MS (focusing on Machine Learning). Before that, in what now seems like a past life, I worked on problems in educational analytics, clustering and ensemble learning under the supervision of Professors Neil T. Heffernan and Gábor N. Sárközy, earning another MS (in Computer Science, here's the proof!) with a thesis (Prof. Sonia Chernova was the reader) that presented a new clustering algorithm based on the Szemerédi Regularity Lemma and also a method, somewhat similar to mixture of experts, that uses clustering for ensemble learning. Further afield, I worked in industry in the signal processing domain (Application Specific Integrated Circuits) for roughly one year after acquiring an undergraduate degree in Electronics and Communications Engineering. While working, I also helped my undergraduate advisor, Dr. (Mrs) K. R. Joshi, in teaching three senior-year courses. During my undergrad, I worked on biometrics (face and speech recognition, using subspace projection methods for the former and dynamic programming for the latter). At the same time, I also worked on blind source separation with applications to magnetic resonance image denoising.


This website is being updated; please check back later.

Research reports

  • DeepCMB: Lensing Reconstruction of the Cosmic Microwave Background with Deep Neural Networks
    Joao Caldeira, W. L. Kimmy Wu, Brian D. Nord, Camille Avestruz, Shubhendu Trivedi and Kyle Story.
    Submitted. 2018
    arXiv preprint arXiv:1810.01483

  • Discriminative Learning of Similarity and Group-Equivariant Representations
    Shubhendu Trivedi.
    PhD Thesis. 2018
    arXiv preprint arXiv:1808.10078

  • Clebsch-Gordan Networks: A Fully Fourier Space Spherical Convolutional Neural Network
    Risi Kondor, Zhen Lin and Shubhendu Trivedi.
    Neural Information Processing Systems (NIPS) 2018, Montreal, Canada.
    arXiv preprint arXiv:1806.09231 (PDF)
    [PyTorch Code]
    (authors listed in alphabetical order)

  • On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups
    Risi Kondor and Shubhendu Trivedi.
    International Conference on Machine Learning (ICML) 2018, Stockholm, Sweden
    arXiv preprint arXiv:1802.03690 (PDF)

  • Predicting Molecular Properties with Covariant Compositional Networks
    Hy Truong Son, Shubhendu Trivedi, Horace Pan, Brandon M. Anderson and Risi Kondor.
    The Journal of Chemical Physics (JCP) 148, 241745, American Institute of Physics Publishing, 2018

  • Covariant Compositional Networks for Learning Graphs
    Risi Kondor, Hy Truong Son, Horace Pan, Brandon M. Anderson and Shubhendu Trivedi.
    International Conference on Learning Representations (ICLR) 2018 - WS Track, Vancouver, Canada
    [PyTorch Code]
    arXiv preprint arXiv:1801.02144 (PDF)

  • Identification and measurement of galaxy cluster properties in millimeter wave maps using deep learning
    W. L. Kimmy Wu, Brian D. Nord and Shubhendu Trivedi.

  • Cross-Encoders: Learning Physics from Images
    Joao Caldeira, W. L. Kimmy Wu, Camille Avestruz, Brian D. Nord, Shubhendu Trivedi and Kyle Story.

  • The Jacobian Outerproduct
    Shubhendu Trivedi and Jialei Wang.
    (authors listed in alphabetical order)

  • Discriminative Metric Learning by Neighborhood Gerrymandering
    Shubhendu Trivedi, David McAllester, Gregory Shakhnarovich.
    Neural Information Processing Systems (NIPS) 2014, Montreal, Canada.

  • A Consistent Estimator of the Expected Gradient Outerproduct
    Shubhendu Trivedi, Jialei Wang, Samory Kpotufe, Gregory Shakhnarovich.
    Uncertainty in Artificial Intelligence (UAI) 2014, Quebec City, Canada.
    denotes equal contribution

  • Applying Clustering to the Problem of Predicting Retention within an ITS: Comparing Regularity Clustering with Traditional Methods.
    Fei Song, Shubhendu Trivedi, Yu Tao Wang, Gábor N. Sárközy, Neil T. Heffernan.
    AAAI FLAIRS 2013, St. Pete Beach, FL, United States. (older version)

  • A Practical Regularity Partitioning Algorithm and its Applications in Clustering
    Gábor N. Sárközy, Fei Song, Endre Szemerédi, Shubhendu Trivedi.
    arXiv preprint arXiv:1209.6540, 2012
    (authors listed in alphabetical order)

  • The real world significance of performance prediction
    Zachary A. Pardos, Qing Yang Wang, Shubhendu Trivedi.
    Educational Data Mining (EDM) 2012, Chania, Greece

  • Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
    Shubhendu Trivedi, Zachary A. Pardos, Gábor N. Sárközy, Neil T. Heffernan.
    Educational Data Mining (EDM) 2012, Chania, Greece

  • Clustered Knowledge Tracing
    Zachary A. Pardos, Shubhendu Trivedi, Neil T. Heffernan, Gábor N. Sárközy.
    Intelligent Tutoring Systems (ITS) 2012, Chania, Greece

  • Spectral Clustering in Educational Data Mining
    Shubhendu Trivedi, Zachary A. Pardos, Gábor N. Sárközy, Neil T. Heffernan.
    Educational Data Mining (EDM) 2011, Eindhoven, Netherlands

  • Clustering students to generate an ensemble to improve standard test score predictions
    Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan.
    Artificial Intelligence in Education (AIEd) 2011, Auckland, New Zealand

    Notes/Unpublished Works/Theses

  • Slides : An introduction to Koopman Operators

  • Notes on Asymmetric Metric Learning for kNN Classification
    Shubhendu Trivedi.
    Notes, November 2015
    Working document, PDF

  • The Utility of Clustering in Prediction Tasks
    Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan.
    Unpublished Technical Report, 05 September 2011
    arXiv version: arXiv 1509.06163
    (An early, mostly experimental project report that investigates how to leverage clustering to improve prediction)

  • A Graph-Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction
    Shubhendu Trivedi.
    MS Thesis, 2012


  • A Fully Fourier Space Spherical Convolutional Neural Network based on Clebsch-Gordan Transforms (Provisional US patent application; with R. Kondor and Z. Lin.)

    Current collaborative projects and interests

  • Deep learning over point clouds and sets
  • Deep equivariant networks
  • Understanding the structure and dynamics of supercooled liquids and glasses using machine learning
  • Deep learning for detecting strong gravitational lensing
  • Low shot learning for combinatorial data


    I have taught undergraduate and graduate courses at various points and served as teaching assistant for about a dozen CS/Math/EE courses. Once in a while I have won awards for the same, the most recent being the best TA award in the CS department of The University of Chicago and getting a commendation from the physical sciences division.

    As Instructor/Co-Instructor:

    Graduate Course (University of Chicago, CS):
    -- Deep Learning (CMSC 35246, Textbook: Bengio, Goodfellow, Courville; Course website; Jointly taught with Prof. Risi Kondor)
    Undergraduate Courses (University of Pune, EE):
    -- Introduction to Digital Image Processing (Textbook: Gonzalez and Woods; Jointly taught with Prof. K. R. Joshi)
    -- Image and Signal Processing Lab
    -- Introduction to Bioinformatics (mostly covered the part on data mining)

    As Teaching Assistant:

    Graduate Courses:
    -- CS 534 Artificial Intelligence (Instructor: Dr. Neil T. Heffernan, Textbook: Russell and Norvig)
    -- TTIC 31020 Introduction to Statistical Machine Learning (Instructor: Dr. Gregory Shakhnarovich)
    Undergraduate Courses:
    -- CS 4120 Analysis of Algorithms (Instructor: Dr. Gábor N. Sárközy, Textbook: CLRS/Kleinberg-Tardos)
    -- CS 2223 Introduction to Algorithms with Lua (Instructor: Dr. Joshua D. Guttman, Textbook: CLRS)
    -- CS 3133 Foundations of Computer Science, i.e., Automata Theory (Instructor: Dr. Gábor N. Sárközy, Textbook: Sudkamp)
    -- CS 4341 Introduction to Artificial Intelligence (Instructor: Dr. Neil T. Heffernan, Textbook: Russell and Norvig)
    -- MA 2201 Discrete Mathematics (Instructor: Dr. Gábor N. Sárközy, Textbook: Kenneth Rosen)
    -- CS 2223 Introduction to Algorithms with Lua (Instructor: Dr. Joshua D. Guttman, Textbook: CLRS)
    -- CS 3133 Foundations of Computer Science, i.e., Automata Theory (Instructor: Dr. Gábor N. Sárközy, Textbook: Sudkamp, Dexter Kozen)
    -- CS 2011 Introduction to Machine Organization and Assembly Language (Instructor: Dr. Hugh C. Lauer, Textbook: Bryant and O'Hallaron)
    -- STAT 27725/CMSC 25400 Machine Learning (Instructor: Dr. Imre Risi Kondor)
    (Slides from some lectures I gave in this course:
    Discrete Probability Tutorial | Maximum Likelihood Estimation and Multivariate Gaussians
    Artificial Neural Networks I | Artificial Neural Networks II)

    Selected Courses

    Graduate Courses:
    Introduction to Statistical Machine Learning, Mathematical Foundations (type theory), Metric Geometry, Algorithms, Discrete Mathematics, Information Theory, Signals, systems and random processes, Speech technologies, Non-linear dynamical systems and chaos, Computability and complexity theory, Intelligent tutoring systems, Artificial Intelligence (with LISP), Automata Theory (Foundations of Computer Science), Numerical Linear Algebra, Combinatorics, Knowledge discovery and data mining, Logic in computer science etc.

    Undergraduate Courses:
    Very Large Scale Integration, Computer and voice networks, Optical and Microwave communication, Image processing, signal processing, Computer Architecture, Analog and Digital Communication, Advanced Microprocessors, Coding Theory, Power Electronics, Mechatronics, Electromagnetic fields, Network theory, Linear and non-linear control theory, Vector calculus, Real analysis, Abstract algebra, Ordinary differential equations, Elementary differential geometry etc.

    Service etc.

    IEEE (2007 - ), ACM (2010 - )

    Neural Information Processing Systems (NIPS), User Modeling, Adaptation, and Personalization (UMAP), IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), International Conference on Machine Learning (ICML), etc.


    My Erdős Number is 2*. My Bacon Number is ∞. I don't eat Bacon.
    *Paths (listing on the Erdős number project):
    1. Shubhendu Trivedi (2011) ← Gábor N. Sárközy (1997) ← Paul Erdős (1932)
    2. Shubhendu Trivedi (2012) ← Endre Szemerédi (1966) ← Paul Erdős (1932)

    Elsewhere on the Internet:

    -- Google Scholar
    -- Onionesque Reality (a dormant blog, mostly on random things)
    -- Goodreads (again, not too frequently updated, it is hard to catch up with my own reading speed ;)
    -- Twitter (mostly ML related)