My research interests lie at the interplay of Cognitive Science and AI. I use machine learning models to understand core cognition. As a secondary goal, I hope to use this understanding to inspire more human-like Artificial Intelligence systems.
I am an assistant research scientist at NYU working with Dr. Brenden Lake and Dr. Moira Dillon. Prior to this, I worked with Dr. Guangyu Robert Yang at MIT. I obtained my Master's in Computer Science from NYU and a Bachelor of Arts in Mathematics and Philosophy from Washington University in St. Louis.
Outside of science, I am interested in the philosophy of mind and language. I am an Argentine Tango dancer, photographer, botanist, and flutist.
I am always happy to connect with like-minded people. You can reach out to me at wenjieli@nyu.edu.
Research Keywords: core cognition, concept learning, intuitive theory, representation learning, deep learning, probabilistic modeling
Email / Twitter / GitHub / Google Scholar
Li, W., Yasuda, S., Lake, B. M., & Dillon, M. R. (2023). A Machine Theory of Mind Benchmark Inspired by Infant Cognition. https://doi.org/10.31234/osf.io/zf5nh
The Baby Intuitions Benchmark (BIB) presents a suite of theory-of-mind tasks drawn from infant studies, enabling direct comparison between infants' theory-of-mind capacities and those of AI systems. Our new work expands the existing suite of tasks, which focuses on agency reasoning, introducing new cognitive tasks that challenge machine understanding of more complex causal relations, mental states, and social dynamics among multiple agents. We evaluate the benchmark using a Transformer encoder-decoder trained with a self-supervised learning paradigm. The model improves on existing baselines, raising the upper bound of deep learning models' causal and object-oriented reasoning capacities. However, it still falls short at representing others' capacities and mental states, underscoring the challenge of achieving human-like theory-of-mind reasoning in AI.
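To make the evaluation recipe concrete, here is a minimal PyTorch sketch (not the paper's code) of the kind of self-supervised Transformer encoder-decoder described above: the model learns to predict the next frame embedding of an event, and a test event is then scored by its prediction error, with higher error corresponding to greater "surprise." The names NextStepPredictor and surprisal, all sizes, and the frame-embedding interface are assumptions for illustration.

```python
import torch
import torch.nn as nn

class NextStepPredictor(nn.Module):
    """Illustrative encoder-decoder: predict the next frame embedding."""
    def __init__(self, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True)
        self.head = nn.Linear(d_model, d_model)

    def forward(self, context, target_prefix):
        # context: encoded familiarization frames, (batch, T_ctx, d_model)
        # target_prefix: encoded test frames shifted right, (batch, T_tgt, d_model)
        T = target_prefix.size(1)
        causal = nn.Transformer.generate_square_subsequent_mask(T)
        h = self.transformer(context, target_prefix, tgt_mask=causal)
        return self.head(h)  # predicted next-frame embeddings

def surprisal(model, context, test_frames):
    """Mean squared next-frame prediction error over one test event."""
    with torch.no_grad():
        pred = model(context, test_frames[:, :-1])
        return ((pred - test_frames[:, 1:]) ** 2).mean().item()

# One self-supervised training step: predict frame t+1 from frames up to t.
model = NextStepPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
context = torch.randn(8, 20, 64)  # stand-in for encoded familiarization video
event = torch.randn(8, 10, 64)    # stand-in for encoded test video
loss = nn.functional.mse_loss(model(context, event[:, :-1]), event[:, 1:])
loss.backward(); opt.step()
```

Under this scheme, a violation-of-expectation comparison reduces to checking whether surprisal is higher for the "unexpected" test event than for the "expected" one, mirroring infant looking-time logic.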
Keywords: Common Sense Reasoning; Theory of Mind; Social Cognition; Benchmarks; Artificial Intelligence
When children learn to count, they start by memorizing the quantities associated with small sets of objects and, as they develop, progress to recursive counting for larger sets. This progression is a classic case of conceptual change in cognitive science, characterized by qualitative transformations of a conceptual system. It also resembles 'grokking' in machine learning, where models show improved out-of-distribution generalization only after prolonged training, long after the training loss has plateaued. Can insights about grokking shed light on the nature of learning and conceptual change in humans?
This work trains small Transformer models on a give-N task that mirrors classic experiments on children's counting. The models initially converge quickly on the training data; with prolonged training, they then generalize to unseen quantities, reaching over 99% accuracy. Drawing on the mechanistic-interpretability literature on grokking, we aim to uncover a mechanistic explanation of this phenomenon in our model and its possible parallels in child development.
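As a rough sketch of this kind of setup (assumed details, not the actual experiments): a tiny causal Transformer is trained to "give N" by emitting N ITEM tokens followed by STOP, but only for small quantities, and training continues long past the train-loss plateau while generalization to held-out larger quantities is tracked. All token names, sizes, and hyperparameters below are illustrative; strong weight decay is included because the grokking literature often implicates it.

```python
import torch
import torch.nn as nn

ITEM, STOP, MAX_N = 0, 1, 12
VOCAB = 2 + MAX_N  # ITEM, STOP, plus number words for 1..MAX_N

def make_example(n):
    # "give n": the number word, then n ITEM tokens, then STOP
    return [1 + n] + [ITEM] * n + [STOP]

def make_batch(ns, length=MAX_N + 2):
    x = torch.full((len(ns), length), STOP)  # pad short sequences with STOP
    for i, n in enumerate(ns):
        seq = make_example(n)
        x[i, :len(seq)] = torch.tensor(seq)
    return x

class TinyCounter(nn.Module):
    """Small causal Transformer language model over the give-N vocabulary."""
    def __init__(self, d=64):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, d)
        self.pos = nn.Embedding(MAX_N + 2, d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(d, VOCAB)

    def forward(self, x):
        T = x.size(1)
        h = self.tok(x) + self.pos(torch.arange(T))
        causal = nn.Transformer.generate_square_subsequent_mask(T)
        return self.out(self.blocks(h, mask=causal))

model = TinyCounter()
# Strong weight decay is the ingredient most often tied to grokking.
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1.0)
train_ns = list(range(1, 7)) * 8  # train only on small quantities, n <= 6
for step in range(50_000):        # keep training long after the loss plateaus
    x = make_batch(train_ns)
    logits = model(x[:, :-1])     # next-token prediction
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), x[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
    # periodically: exact-match accuracy on held-out n in 7..MAX_N
```

Grokking would show up here as held-out exact-match accuracy jumping from chance to near-perfect many thousands of steps after the training loss has already flattened.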
Keywords: Conceptual Change; Counting; Grokking; Concept Learning; Child Development
Li, W., & Li, Y. (2020). Entropy, mutual information, and systematic measures of structured spiking neural networks. Journal of Theoretical Biology, 501, 110310. https://doi.org/10.1016/j.jtbi.2020.110310 [Link to Paper]
This paper investigates information-theoretic measures, including entropy, mutual information, and several systematic measures built on mutual information, for a class of structured spiking neuronal networks. To analyze and compute these measures for large networks, we coarse-grain the data by ignoring the order of spikes that fall into the same small time bin. The resulting coarse-grained entropy mainly captures the information carried by the rhythm produced by a local population of the network. We first show that these measures are well-defined and computable by proving stochastic stability and a law of large numbers. We then use three neuronal network examples, ranging from simple to complex, to investigate the measures, and we present several analytical and computational results about their properties.
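The coarse-graining step and the resulting entropy estimate are easy to illustrate numerically. The following sketch (a simplified illustration under assumed parameters, not the paper's analysis) bins spike times into small windows, discards within-bin spike order, and estimates the entropy of the binned count sequence from empirical frequencies; mutual information between two populations then follows from I(X; Y) = H(X) + H(Y) - H(X, Y).

```python
import numpy as np
from collections import Counter

def coarse_grain(spike_times, t_end, dt=0.005):
    """Spike counts per time bin of width dt; spike order within a bin is lost."""
    edges = np.arange(0.0, t_end + dt, dt)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts

def plug_in_entropy(symbols):
    """Empirical (plug-in) entropy, in bits, of a discrete symbol sequence."""
    freqs = np.array(list(Counter(symbols).values()), dtype=float)
    p = freqs / freqs.sum()
    return float(-(p * np.log2(p)).sum())

# Example: entropy of the binned counts of a ~20 Hz Poisson spike train.
rng = np.random.default_rng(0)
spikes = np.cumsum(rng.exponential(scale=1 / 20.0, size=2000))
counts = coarse_grain(spikes, t_end=spikes[-1])
# Joint entropy over count pairs from two trains gives H(X, Y) for mutual info.
print(f"coarse-grained entropy per bin: {plug_in_entropy(counts):.3f} bits")
```

The plug-in estimator is biased for small samples, which is one reason the paper's analytical guarantees (stochastic stability, a law of large numbers) matter before trusting such computations on large networks.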
Keywords: Complexity; Degeneracy; Entropy; Mutual information; Neural field models.
Copyright © 2024 Wenjie Li - All Rights Reserved.