Publications.
On this page, you will find a list of my publications, along with their abstracts.
Observable Propagation: A Data-Efficient Approach to Uncover Feature Vectors in Transformers
Dunefsky, J. and Cohan, A., 2023. Observable Propagation: A Data-Efficient Approach to Uncover Feature Vectors in Transformers. arXiv preprint arXiv:2312.16291.
A key goal of current mechanistic interpretability research in NLP is to find linear features (also called "feature vectors") for transformers: directions in activation space corresponding to concepts that are used by a given model in its computation. Present state-of-the-art methods for finding linear features require large amounts of labelled data -- both laborious to acquire and computationally expensive to utilize. In this work, we introduce a novel method, called "observable propagation" (in short: ObsProp), for finding linear features used by transformer language models in computing a given task -- using almost no data. Our paradigm centers on the concept of observables, linear functionals corresponding to given tasks. We then introduce a mathematical theory for the analysis of feature vectors: we provide theoretical motivation for why LayerNorm nonlinearities do not affect the direction of feature vectors; we also introduce a similarity metric between feature vectors called the coupling coefficient which estimates the degree to which one feature's output correlates with another's. We use ObsProp to perform extensive qualitative investigations into several tasks, including gendered occupational bias, political party prediction, and programming language detection. Our results suggest that ObsProp surpasses traditional approaches for finding feature vectors in the low-data regime, and that ObsProp can be used to better understand the mechanisms responsible for bias in large language models. Code for experiments is linked from the paper's arXiv page.
Transport control networking: optimizing efficiency and control of data transport for data-intensive networks
Dunefsky, J. (co-first-author), Soleimani, M., Yang, R., Ros-Giralt, J., Lassnig, M., Monga, I., Wuerthwein, F.K., Zhang, J., Gao, K. and Yang, Y.R., 2022, August. Transport control networking: optimizing efficiency and control of data transport for data-intensive networks. In Proceedings of the ACM SIGCOMM Workshop on Network-Application Integration (pp. 60-66).
Data-intensive sciences are becoming increasingly important for modern sciences. The transport control plane (TC-Plane) of the networks supporting data-intensive sciences can be important to achieve efficient and controlled transport of data for data-intensive sciences. In this paper, we analyze FTS, which is the de facto TC-Plane of the largest data-intensive network, revealing both efficiency and resource control issues of the current design. We then present the design and initial evaluation of Transport Control Networking (TCN), a design that is based on FTS but introduces (1) network-application co-design/coordination, which uses ALTO to realize network-wide resource control, and (2) a general, efficient, flexible optimization framework for TC-Plane, which allows both zero-order and first-order (e.g., bottleneck structure) gradient-based algorithms. We also discuss future work to engage the broad networking and data-intensive sciences communities.
Analysis of neural clusters due to deep brain stimulation pulses
Kuelbs, D., Dunefsky, J. (co-first-author), Monga, B. and Moehlis, J., 2020. Analysis of neural clusters due to deep brain stimulation pulses. Biological Cybernetics, 114(6), pp.589-607.
Deep brain stimulation (DBS) is an established method for treating pathological conditions such as Parkinson’s disease, dystonia, Tourette syndrome, and essential tremor. While the precise mechanisms which underlie the effectiveness of DBS are not fully understood, several theoretical studies of populations of neural oscillators stimulated by periodic pulses have suggested that this may be related to clustering, in which subpopulations of the neurons are synchronized, but the subpopulations are desynchronized with respect to each other. The details of the clustering behavior depend on the frequency and amplitude of the stimulation in a complicated way. In the present study, we investigate how the number of clusters and their stability properties, bifurcations, and basins of attraction can be understood in terms of one-dimensional maps defined on the circle. Moreover, we generalize this analysis to stimuli that consist of pulses with alternating properties, which provide additional degrees of freedom in the design of DBS stimuli. Our results illustrate how the complicated properties of clustering behavior for periodically forced neural oscillator populations can be understood in terms of a much simpler dynamical system.
An Emotion Regulation Tablet App for Middle-Aged and Older Adults at High Suicide Risk: Feasibility, Acceptability, and Two Case Studies
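Kiosses, D.N., Monkovic, J., Stern, A., Czaja, S.J., Alexopoulos, G., Arslanoglou, E., Ebo, T., Pantelides, J., Yu, H., Dunefsky, J., Smeragliuolo, A. and Putrino, D., 2021. An Emotion Regulation Tablet App for Middle-Aged and Older Adults at High Suicide Risk: Feasibility, Acceptability, and Two Case Studies. The American Journal of Geriatric Psychiatry.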
Objective
The unique features of technological applications may improve the treatment of people at risk of suicide. In this article, we present feasibility and acceptability data as well as two case studies demonstrating the use of WellPATH, a tablet app that aims to help suicidal patients during emotionally-charged situations outside of therapy sessions. The WellPATH app was part of a 12-week psychotherapy intervention (CRISP – Cognitive Reappraisal Intervention for Suicide Prevention) for middle-aged and older adults after their discharge from a suicide-related hospitalization.
Design
The use of WellPATH includes three stages: preparation and practice, incorporation, and actual use.
Measurements
Feasibility was measured by the overall use of WellPATH during 12 weeks, and acceptability was measured with the three items of the Client Satisfaction Questionnaire.
Results
Twelve study participants were administered WellPATH as part of CRISP. The results provide preliminary evidence of feasibility and acceptability of WellPATH. Study participants and therapists reported high satisfaction with WellPATH and provided feedback for future research and development. The patients in the case studies reported a reduction in negative emotions and an increase in emotion regulation (i.e., cognitive reappraisal ability) after using techniques on the WellPATH app.
Conclusion
Our preliminary findings suggest that use of technology applications such as the WellPATH app is feasible and accepted among middle-aged and older adults at high suicide risk. Further research with an adequately powered sample is needed to further evaluate WellPATH's feasibility and accessibility, and test its efficacy with this high-risk population.