Biography

Former physicist and financier, now a Ph.D. student at UC San Diego researching deep learning and artificial intelligence with co-advisors David Meyer and Gary Cottrell. I graduated from MIT in physics and researched techniques for directional dark matter detection with the DMTPC collaboration. After a two-year stint in finance at Fidelity, I came back to graduate school to focus on data analysis and deep learning across a variety of tasks.

Summer 2015
Zillow Group Internship
Deep CNNs and LSTMs for image quality detection.
2015-
Gary Cottrell GURU Group
Deep learning.
2014-
David Meyer Group
Topological Data Analysis.
2013-2014
CMS Group at CERN
Classification algorithms to improve efficiency of track reconstruction.
2013-
UC San Diego Ph.D. student
Machine Learning focusing on deep learning with CNNs and RNNs.
2010-2013
Fidelity Management and Research Company
Fundamental and Quantitative Equity Investments.
2008-2009
Cambridge University
2006-2010
Massachusetts Institute of Technology: Bachelor's Degree
Major in Physics (8A for you MIT nerds).

Research

Persistent Homology for Mobile Phone Data Analysis
Topological data analysis is a new approach to analyzing the structure of high dimensional datasets. Persistent homology, specifically, generalizes hierarchical clustering methods to identify significant higher dimensional properties. In this project, we analyze mobile network data from Senegal to determine whether significant topological structure is present. We investigate two independent questions: whether the introduction of the Dakar motorway has any significant impact on the topological structure of the data, and how communities can be constructed using this method. We consider three independent metrics to compute the persistent homology. In two of these metrics, we see no topological change in the data given the introduction of the motorway; in the remaining metric, we see a possible indication of topological change. The behavior of clustering using the persistent homology calculation is sensitive to the choice of metric, and is similar in one case to the communities computed using modularity maximization.
W. Fedus et al
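The link between persistent homology and hierarchical clustering mentioned above is easiest to see in dimension zero, where the persistence of connected components reproduces the merge heights of single-linkage clustering. The following is a minimal illustrative sketch, not the project's code: the function name `h0_persistence`, the Euclidean metric, and the toy points are all assumptions.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Return the death scales of 0-dimensional features (connected components).

    Every component is born at scale 0; a component dies at the edge length
    that first merges it into another one. These are the merge heights of
    single-linkage hierarchical clustering.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (an assumed metric).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge joins two separate components
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 finite deaths; one component persists forever


# Two tight pairs far apart: two short-lived features and one long-lived one.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
deaths = h0_persistence(toy_points)
```

A large gap between consecutive death scales, as between the two tight pairs and the final merge here, is the kind of feature that would be read as significant structure. Changing the metric changes the edge ordering, which is one way to see why the clustering is sensitive to that choice.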
Background Rejection in the DMTPC Dark Matter Search Using Charge Signals
The Dark Matter Time Projection Chamber (DMTPC) collaboration is developing low-pressure gas TPC detectors for measuring WIMP-nucleon interactions. Optical readout with CCD cameras allows for the detection of the daily modulation in the direction of the dark matter wind, while several charge readout channels allow for the measurement of additional recoil properties. In this article, we show that the addition of the charge readout analysis to the CCD allows us to obtain a statistics-limited 90% C.L. upper limit on the e− rejection factor of 5.6 × 10^{-6} for recoils with energies between 40 and 200 keVee. In addition, requiring coincidence between charge signals and light in the CCD reduces CCD-specific backgrounds by more than two orders of magnitude.
J. Lopez et al
Arxiv
DMTPC: Dark matter detection with directional sensitivity.
The Dark Matter Time Projection Chamber (DMTPC) experiment uses CF4 gas at low pressure (0.1 atm) to search for the directional signature of Galactic WIMP dark matter. We describe the DMTPC apparatus and summarize recent results from a 35.7 g-day exposure surface run at MIT. After nuclear recoil cuts are applied to the data, we find 105 candidate events in the energy range 80 - 200 keV, which is consistent with the expected cosmogenic neutron background. Using this data, we obtain a limit on the spin-dependent WIMP-proton cross-section of 2.0 × 10^{-33} cm^2 at a WIMP mass of 115 GeV/c^2. This detector is currently deployed underground at the Waste Isolation Pilot Plant in New Mexico.
J. Battat et al
Proceedings of Science
First Dark Matter Search Results from a Surface Run of the 10-L DMTPC Directional Dark Matter Detector.
The Dark Matter Time Projection Chamber (DMTPC) is a low pressure (75 Torr CF4) 10 liter detector capable of measuring the vector direction of nuclear recoils with the goal of directional dark matter detection. In this paper we present the first dark matter limit from DMTPC. In an analysis window of 80-200 keV recoil energy, based on a 35.7 g-day exposure, we set a 90% C.L. upper limit on the spin-dependent WIMP-proton cross section of 2.0 x 10^{-33} cm^{2} for 115 GeV/c^2 dark matter particle mass.
S. Ahlen et al
Physics Letters
The case for a directional dark matter detector and the status of current experimental efforts.
We present the case for a dark matter detector with directional sensitivity. This document was developed at the 2009 CYGNUS workshop on directional dark matter detection, and contains contributions from theorists and experimental groups in the field. We describe the need for a dark matter detector with directional sensitivity; each directional dark matter experiment presents their project's status; and we close with a feasibility study for scaling up to a one ton directional detector, which would cost around $150M.
S. Ahlen et al
International Journal of Modern Physics A

Coursework

(CSE 291) Deep Neural Networks for Identity and Emotion Recognition
We train two neural networks on the POFA and NimStim datasets to identify individuals and identify emotions, respectively. In order to train these neural networks, we use two separate optimization procedures, the minFunc package and stochastic gradient descent.
William Fedus, Bobak Hashemi, Matthew Burns
(CSE 291) Efficient Encoding Using Deep Neural Networks
Deep neural networks have been used to efficiently encode high-dimensional data into low-dimensional representations. In this report, we attempt to reproduce the results of Hinton and Salakhutdinov. We use Restricted Boltzmann machines to pre-train, and standard backpropagation to fine-tune a deep neural network to show that such a network can efficiently encode images of handwritten digits. We also construct another deep autoencoder using stacked autoencoders and compare the performance of the two autoencoders.
Chaitanya Ryali, Gautam Nallamala, William Fedus and Yashodhara Prabhuzantye
(CSE 250B) Voting, Averaged and Kernelized Perceptrons
We employ various perceptron models for the classification of points.
William Fedus
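As an illustration of the kernelized variant named in the title, here is a minimal sketch of a kernel perceptron. It is not the coursework code: the RBF kernel, the `gamma` value, the function names, and the XOR-style toy data are all assumptions chosen to show a case a linear perceptron cannot separate.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points; gamma is an assumed setting."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def train_kernel_perceptron(xs, ys, kernel=rbf_kernel, max_epochs=100):
    """Return per-example mistake counts (alphas) for a kernelized perceptron."""
    alphas = [0] * len(xs)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(xs, ys)):
            score = sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alphas, xs, ys))
            if y * score <= 0:  # misclassified (or exactly on the boundary)
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:  # a full clean pass: training has converged
            break
    return alphas


def predict(x, xs, ys, alphas, kernel=rbf_kernel):
    """Classify x as +1 or -1 from the kernel expansion over training points."""
    score = sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alphas, xs, ys))
    return 1 if score > 0 else -1


# XOR-style labels: not linearly separable in the input space.
xs = [(0, 0), (0, 1), (1, 0), (1, 1)]
ys = [-1, 1, 1, -1]
alphas = train_kernel_perceptron(xs, ys)
predictions = [predict(x, xs, ys, alphas) for x in xs]
```

The weight vector is never formed explicitly; the model is stored as mistake counts on training points, which is what lets the kernel stand in for an inner product in a richer feature space.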
(CSE 250B) Logistic Regression with Gradient Descent Optimization
Simple logistic regression on data drawn from multivariate Gaussians, optimized using gradient descent.
William Fedus
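A minimal sketch of this setup, assuming two isotropic 2-D Gaussian classes, a fixed learning rate, and plain batch gradient descent on the mean negative log-likelihood. The class means, step count, and function names are illustrative assumptions, not the values used in the coursework.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.1, steps=500):
    """Batch gradient descent on the mean negative log-likelihood.

    xs: list of feature tuples, ys: list of 0/1 labels.
    Returns (weights, bias).
    """
    dim = len(xs[0])
    w, b = [0.0] * dim, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            # Gradient of the log-loss w.r.t. the logit is (prediction - label).
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for k in range(dim):
                grad_w[k] += err * x[k] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def sample_gaussian(mean, n, rng, std=1.0):
    """Draw n points from an isotropic Gaussian (a simplifying assumption)."""
    return [tuple(rng.gauss(m, std) for m in mean) for _ in range(n)]


rng = random.Random(0)
class0 = sample_gaussian((-2.0, -2.0), 100, rng)
class1 = sample_gaussian((2.0, 2.0), 100, rng)
xs = class0 + class1
ys = [0] * 100 + [1] * 100

w, b = fit_logistic(xs, ys)
accuracy = sum(
    (sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5) == bool(y)
    for x, y in zip(xs, ys)
) / len(xs)
```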
(CSE 250B) Handwritten Digit Classification via Generative Model
We use the MNIST data set to fit ten 784-dimensional multivariate Gaussians, one for each handwritten digit class. Each grayscale handwritten digit may be considered as a 784-dimensional vector.
William Fedus
(CSE 250B) Text Classification via Multinomial Naive Bayes
Using the 20 Newsgroups data set, we train a multinomial Naive Bayes classifier to predict the class designation of a particular document composed of $n$ unique features $x_1,\dots,x_n$, where $n$ is the size of the vocabulary.
William Fedus
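The classifier described above can be sketched in a few lines. This is a toy version under stated assumptions: a four-document made-up corpus stands in for 20 Newsgroups, Laplace smoothing with `alpha=1.0` is assumed, and the function names are illustrative.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit class log-priors and Laplace-smoothed per-class word log-likelihoods."""
    vocab = sorted({word for doc in docs for word in doc})
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)

    log_prior = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    log_likelihood = {}
    for c in class_docs:
        # Smoothed denominator: class word total plus alpha per vocabulary word.
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return log_prior, log_likelihood


def predict_nb(doc, log_prior, log_likelihood):
    """Return the class with the highest posterior log-score for a token list."""
    scores = {}
    for c in log_prior:
        # Words outside the training vocabulary are ignored.
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in doc if w in log_likelihood[c]
        )
    return max(scores, key=scores.get)


# Tiny made-up corpus standing in for newsgroup posts.
docs = [
    "the pitcher threw a fastball".split(),
    "home run in the ninth inning".split(),
    "the rocket reached low orbit".split(),
    "nasa launched a new satellite".split(),
]
labels = ["baseball", "baseball", "space", "space"]
log_prior, log_likelihood = train_multinomial_nb(docs, labels)
guess = predict_nb("the satellite reached orbit".split(), log_prior, log_likelihood)
```

Working in log space avoids underflow when many small word probabilities are multiplied, which matters once documents contain hundreds of tokens.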
(CSE 250B) Nearest Neighbor Classification
We train and test our algorithm on the MNIST dataset of handwritten digits, which includes 60,000 training examples and 10,000 test examples. Each handwritten digit is encoded in a 28x28 grayscale image which can be flattened into a 784-dimensional vector, where the intensity of a particular pixel is the value along a certain dimension. In order to reduce the computational complexity of the classification, we seek a smaller set of prototypes within the full 60,000 training set.
William Fedus
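To make the prototype idea concrete, here is a hedged sketch of 1-nearest-neighbor classification against a reduced prototype set. The summary above does not say how prototypes were chosen, so uniform random selection is used purely as an assumed baseline; the 2-D toy data and all function names are likewise illustrative.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_prototypes(xs, ys, m, rng):
    """Pick m training points uniformly at random as prototypes.

    Random selection is only a baseline assumed here; the report's actual
    selection rule is not described in the summary above.
    """
    chosen = rng.sample(range(len(xs)), m)
    return [xs[i] for i in chosen], [ys[i] for i in chosen]


def nearest_neighbor_predict(x, proto_xs, proto_ys):
    """1-NN: return the label of the closest prototype."""
    best = min(range(len(proto_xs)), key=lambda i: squared_distance(x, proto_xs[i]))
    return proto_ys[best]


# Toy 2-D stand-in for flattened 784-dimensional digit vectors.
rng = random.Random(0)
xs = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(200)] + [
    (rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(200)
]
ys = [0] * 200 + [1] * 200

proto_xs, proto_ys = select_prototypes(xs, ys, 20, rng)
accuracy = sum(
    nearest_neighbor_predict(x, proto_xs, proto_ys) == y for x, y in zip(xs, ys)
) / len(xs)
```

Each prediction costs one distance computation per prototype, so shrinking the set from the full training data to m prototypes cuts query time proportionally.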
(CSE 255) Link Prediction Among Suspected Terrorists
In this paper we train a logistic regression function for two forms of link prediction among a set of 244 suspected terrorists in a social network. We train and test on a dataset created at the University of Maryland and further modified at UCSD by Eric Doi and Ke Tang. The supposed terrorists have several labels for the nature of their links to other supposed terrorists; terrorists are classified as either colleagues, family, contacts, or congregates. Structural information about the known network connectivity of the supposed terrorists is integrated with additional binary information provided about the individuals to arrive at two final models. The first model predicts the existence of any type of link between two individuals and the second model classifies whether an existing link is 'colleague' or 'other'.
Alex Asplund, William Fedus
(CSE 255) Movie Sentiment Analysis
In this paper we train an L1-regularized linear support vector machine (SVM) to determine whether the sentiment of a movie review is positive or negative. We train and test on the movie review polarity dataset introduced by Pang and Lee, 2004. Classification accuracy of the linear SVM is improved through a series of experiments for various data preprocessing techniques and data transformations. Classification accuracy is found to be maximum on the 10 cross-validation folds after removing numerical entries and performing log odds weighting of terms.
Alex Asplund, William Fedus
(CSE 255) Collaborative Filtering Algorithm
In this paper we implement a collaborative filtering algorithm on the MovieLens dataset to predict movie ratings for the users. The original matrix, which contains the movie ratings on a 1-5 scale for the users, has many missing entries. Rank-K factorization is used to construct the filtering algorithm and alternating least squares is then performed on the two lower rank matrices in order to fill in the missing entries of the original matrix and thus predict all movie ratings for all users.
Alex Asplund, William Fedus
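The rank-K factorization with alternating least squares can be sketched as follows. This is an illustrative toy, not the paper's implementation: the L2 regularization weight `lam`, the rank `k=2`, the random initialization, and the made-up rating triples are all assumptions, and a small hand-written linear solver is used to keep the example dependency-free.

```python
import random


def solve(a, b):
    """Solve the small dense system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def update_factors(fixed, ratings_by_row, k, lam):
    """Ridge-regress each row's factor vector against the fixed factors."""
    updated = []
    for observed in ratings_by_row:  # observed: list of (other_index, rating)
        a = [[lam if i == j else 0.0 for j in range(k)] for i in range(k)]
        b = [0.0] * k
        for idx, rating in observed:
            f = fixed[idx]
            for i in range(k):
                b[i] += rating * f[i]
                for j in range(k):
                    a[i][j] += f[i] * f[j]
        updated.append(solve(a, b))
    return updated


def objective(ratings, users, items, lam):
    """Squared error on observed entries plus L2 penalty on both factor sets."""
    err = sum(
        (r - sum(p * q for p, q in zip(users[u], items[i]))) ** 2
        for u, i, r in ratings
    )
    reg = sum(v * v for row in users + items for v in row)
    return err + lam * reg


def als(ratings, n_users, n_items, k=2, lam=0.1, iters=10, seed=0):
    """Alternate between solving for user factors and item factors."""
    rng = random.Random(seed)
    users = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_users)]
    items = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    history = [objective(ratings, users, items, lam)]
    for _ in range(iters):
        users = update_factors(items, by_user, k, lam)
        items = update_factors(users, by_item, k, lam)
        history.append(objective(ratings, users, items, lam))
    return users, items, history


# Made-up (user, item, rating) triples on a 1-5 scale; other pairs are unobserved.
ratings = [
    (0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
    (2, 1, 5), (2, 2, 2), (3, 0, 1), (3, 2, 5),
]
users, items, history = als(ratings, n_users=4, n_items=3)
predicted = sum(p * q for p, q in zip(users[0], items[2]))  # an unobserved entry
```

With one factor matrix held fixed, each row of the other has a closed-form ridge-regression solution, so the regularized objective tracked in `history` cannot increase from one sweep to the next.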
(PHYS 210A) Entropy in Classical and Quantum Information Theory
Entropy is a central concept in both classical and quantum information theory, measuring the uncertainty and the information content in the state of a physical system. This paper reviews classical information theory and then proceeds to generalizations into quantum information theory. Both Shannon and Von Neumann entropy are discussed, making the connection to compressibility of a message stream and the generalization of compressibility in a quantum system. Finally, the paper considers the application of Von Neumann entropy in entanglement of formation for both pure and mixed bipartite quantum states.
William Fedus
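The two entropies discussed in the paper can be computed side by side in a short sketch. It is restricted to single-qubit (2x2) density matrices so the eigenvalues have a closed form; the function names and example states are illustrative assumptions.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy H(p) = -sum p log2 p, in bits (0 log 0 taken as 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def qubit_eigenvalues(rho):
    """Eigenvalues of a 2x2 Hermitian density matrix [[a, b], [conj(b), d]]."""
    a, b, d = rho[0][0].real, rho[0][1], rho[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return [mean + radius, mean - radius]


def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho): the Shannon entropy of rho's eigenvalues."""
    # Clamp tiny negative eigenvalues produced by floating-point round-off.
    return shannon_entropy([max(ev, 0.0) for ev in qubit_eigenvalues(rho)])


fair_coin = shannon_entropy([0.5, 0.5])
pure_state = von_neumann_entropy([[0.5, 0.5], [0.5, 0.5]])       # the state |+><+|
maximally_mixed = von_neumann_entropy([[0.5, 0.0], [0.0, 0.5]])  # the state I/2
```

For a pure bipartite state, the entanglement of formation equals the von Neumann entropy of either reduced state. Tracing out one half of a Bell pair leaves the maximally mixed state I/2, so `maximally_mixed` also gives the one bit of entanglement carried by that pair.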

Random

Surf reports for some of my favorite spots around San Diego, CA: Blacks, Bird Rock and Tourmaline. Also, a brief intro to surf lingo.
Conquering, arguably, one of the most dangerous bike jumps in the world.
Great set of educational links, the No Excuse List.
Andrej Karpathy's blog post The Unreasonable Effectiveness of Recurrent Neural Networks, and his academic site (the clear inspiration for this site template).
Educational reference for Recurrent Neural Networks (RNNs), including code, theory, applications, etc.
Wait But Why Blog. For an afternoon read, there is an interesting perspective on Elon Musk.
Hinton's Coursera Lectures, full of excellent insights.