Publications and Poster Presentations

Li, W., Yasuda, S., Lake, B. M., & Dillon, M. R. (2023). A Machine Theory of Mind Benchmark Inspired by Infant Cognition. https://doi.org/10.31234/osf.io/zf5nh

Li, W. (2023). Learning to Count through Grokking. Manuscript in Preparation.

Yasuda, S., Li, W., Martinez, D., Lake, B., & Dillon, M. R. (2023). Goal attribution in human infants and machines. The Curiosity, Creativity, and Complexity Conference, New York, NY.

Yasuda, S., Li, W., Martinez, D., Lake, B., & Dillon, M. R. (2023). 15 month olds’ understanding of social and instrumental imitation. Submitted for Review at the International Congress of Infant Studies.

Li, W., & Li, Y. (2020). Entropy, mutual information, and systematic measures of structured spiking neural networks. Journal of Theoretical Biology, 501, 110310. [link to paper]

Li, W., Bakker, M., & Carthew, R. (2019). Bayesian inference of Dpp-regulated gene expression in Drosophila imaginal discs. Poster presented at the Conference for Quantitative Biology, Northwestern University.

Recent Research Projects

Baby Intuition Benchmark

Li, W., Yasuda, S., Dillon, M. R., & Lake, B. M. (2023). An Infant-Inspired Benchmark for Machine Social Cognition. Manuscript in Preparation. [paper (in prep)]

The Baby Intuition Benchmark (BIB) presents a suite of theory of mind tasks drawn from infant studies, facilitating comparisons between infants' theory of mind capacities and those of AI systems. Our new work expands the existing suite of tasks, which focus on agency reasoning, with new cognitive tasks that challenge machines' understanding of more complex causal relations, mental states, and social dynamics between multiple agents. We evaluate the benchmark using a Transformer encoder-decoder trained with a self-supervised learning paradigm. The model outperforms existing baselines, raising the upper bound of deep learning models' causal and object-oriented reasoning capacities. However, it still shows the limitations of AI in representing other capacities and mental states, underscoring the challenges of achieving human-like theory of mind reasoning in AI.

Keywords: Common Sense Reasoning; Theory of Mind; Social Cognition; Benchmarks; Artificial Intelligence

Learning to Count through Grokking

When children learn to count, they start by memorizing quantities associated with small sets of objects, and as they develop, they progress to recursive counting for larger sets. This progression exemplifies a classic case of conceptual change in cognitive science, characterized by significant qualitative transformations in conceptual systems. This process resembles 'grokking' in machine learning, where models demonstrate improved out-of-distribution generalization with prolonged training after training convergence. Can insights about grokking shed light on the nature of learning and conceptual change in humans?

This work employs small Transformer models trained on a give-N task, mirroring classic experiments on children's counting. The models converge rapidly on the training data but take about ten times longer to generalize, eventually achieving over 99% accuracy on counting to unseen quantities. Drawing on the AI interpretability literature on grokking, we aim to uncover a mechanistic explanation of this phenomenon in our model and its possible parallels in child development.

Keywords: Conceptual Change; Counting; Grokking; Concept Learning; Child Development

Applying Information Theory to Spiking Neural Networks

Li W., Li Y. Entropy, mutual information, and systematic measures of structured spiking neural networks. J Theor Biol. 2020 Sep 21;501:110310. doi: 10.1016/j.jtbi.2020.110310. Epub 2020 May 19. PMID: 32416092. [Link to Paper]

This paper investigates various information-theoretic measures, including entropy, mutual information, and some systematic measures based on mutual information, for a class of structured spiking neuronal networks. To analyze and compute these measures for large networks, we coarse-grain the data by ignoring the order of spikes that fall into the same small time bin. The resulting coarse-grained entropy mainly captures the information contained in the rhythm produced by a local population of the network. We first show that these information-theoretic measures are well-defined and computable by proving stochastic stability and the law of large numbers. Then, we use three neuronal network examples, from simple to complex, to investigate the measures, and we give several analytical and computational results about their properties.

Keywords: Complexity; Degeneracy; Entropy; Mutual Information; Neural Field Models
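
As a rough illustration of the coarse-graining idea in the abstract above, the following minimal sketch bins pooled spike times, discards the order of spikes within each bin, and computes the Shannon entropy of the empirical distribution of per-bin spike counts. The function name, arguments, and the choice of per-bin counts as the coarse-grained variable are assumptions made for illustration; the paper's formal definitions may differ.

```python
import math
from collections import Counter


def coarse_grained_entropy(spike_times, bin_width, t_end):
    """Shannon entropy (in bits) of per-bin spike counts.

    Hypothetical helper: spike_times are pooled over a local population,
    and only the number of spikes per bin is kept, so the order of
    spikes inside a bin is ignored.
    """
    n_bins = int(math.ceil(t_end / bin_width))
    counts = [0] * n_bins
    for t in spike_times:
        if 0 <= t < t_end:
            index = min(int(t // bin_width), n_bins - 1)
            counts[index] += 1
    # Empirical distribution over the observed spike counts.
    frequency = Counter(counts)
    return -sum(
        (c / n_bins) * math.log2(c / n_bins) for c in frequency.values()
    )


# Four bins of width 1.0 with counts [2, 1, 2, 1].
example_entropy = coarse_grained_entropy(
    [0.1, 0.2, 1.5, 2.1, 2.2, 3.7], bin_width=1.0, t_end=4.0
)
```

In this toy input, two count values each occur in half of the bins, so the estimate is one bit. A plug-in estimate like this is biased for short recordings, which is one reason the convergence results mentioned in the abstract matter.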

Other Research Projects

A Reinforcement Learning Model for Fear Reconsolidation and Extinction During Dreaming

Final Project of Computational Cognitive Modeling course (May 2021)
Why do we have bad dreams? How do dreams influence our lives during the day? We use a reinforcement learning agent to demonstrate how dreaming, as an analogy to experience replay, may help achieve fear reconsolidation and extinction. We hypothesize that mismatches of memory units in experience replay facilitate the regulation of fear memory. Using the video game Pac-Man, our experiments show that fear regulation in dreams is possible, and that the brain could switch between fear reconsolidation and extinction in response to environmental changes via a dynamically updated mismatch probability score. [Link to report]

Application of Self-Supervised Learning to an Image Classification Task with Scarce Labels

First Place in Prof. Yann LeCun's Self-Supervised Learning Competition (May 2021)
We devised an approach utilizing unsupervised learning, pseudo-label iterations, semi-supervised learning, and active learning methods. The final result achieves 55.80% accuracy on the test dataset with 0.5% labeled data. [Link to report]

Bayesian Modeling of Dpp Regulation in Fruit Flies

NSF-Simons Center for Quantitative Biology, Northwestern University (Jun. – Aug. 2019)
Genes must be expressed in a specific spatiotemporal order during development, and mRNA levels are key parameters of gene expression. In this project, we study how Dpp, a signaling molecule in the Drosophila wing imaginal disc, regulates its downstream genes, such as salm and brk. We model the kinetic parameters of mRNA transcription as a function of Dpp concentration, using Bayesian inference and Markov Chain Monte Carlo (MCMC) sampling. The method has been tested on synthetic data, where its predictions had an error rate under 8%, and we have applied it to the brk and salm genes. Learn more about this project on GitHub
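
To give a flavor of the Bayesian inference step, here is a minimal sketch of random-walk Metropolis sampling for a single kinetic parameter. It is not the project's actual model: the Poisson likelihood (mRNA count with mean k times the Dpp level), the flat prior on k > 0, the toy data, and all names are assumptions chosen only to keep the example self-contained.

```python
import math
import random


def log_posterior(k, dpp, counts):
    """Unnormalized log posterior of rate k (assumed model).

    Assumes counts[i] ~ Poisson(k * dpp[i]) with a flat prior on k > 0.
    """
    if k <= 0:
        return float("-inf")
    return sum(y * math.log(k * x) - k * x for x, y in zip(dpp, counts))


def metropolis(dpp, counts, n_steps=20000, step=0.5, k0=1.0, seed=0):
    """Random-walk Metropolis sampler; returns samples after burn-in."""
    rng = random.Random(seed)
    k = k0
    lp = log_posterior(k, dpp, counts)
    samples = []
    for _ in range(n_steps):
        k_new = k + rng.gauss(0.0, step)
        lp_new = log_posterior(k_new, dpp, counts)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            k, lp = k_new, lp_new
        samples.append(k)
    return samples[n_steps // 5:]  # discard the first 20% as burn-in


# Hypothetical synthetic data: Dpp levels and observed mRNA counts.
dpp_levels = [1.0, 2.0, 3.0, 4.0]
mrna_counts = [2, 4, 6, 8]
posterior_samples = metropolis(dpp_levels, mrna_counts)
posterior_mean = sum(posterior_samples) / len(posterior_samples)
```

With this toy data the exact posterior is a Gamma distribution with mean 2.1, so the sample mean should land close to that value. A realistic model would involve several kinetic parameters and a noise model fitted to the measurements.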

A Quantum Markov Chains Model of DNA Transcriptions

Freiwald Scholar Research Program, Washington University in St. Louis (Jan. – May 2019)
This project investigates the use of both classical and quantum information theory to describe genome preservation and biological evolution. This write-up focuses on the research of Djordjevic, who attempts to explain the transfer of genetic information from DNA to protein in a Markovian-like quantum biological model. Yockey proposes the classical counterpart of this model.