John Dunagan

SENIOR PRINCIPAL

.

August 2023

From February 2012 onwards, I have been working at Amazon Web Services, first as a Principal Research Scientist, then as a Principal Algorithm Engineer, and most recently as a Senior Principal (as of April 2020). During this time, I have worked on AWS Fraud Prevention, Amazon Kinesis, EC2 Security, the EC2 Control Plane, EC2 Capacity Management, AWS Supply Chain Automation, and the AWS Commerce Platform (my current focus).

Prior to Amazon, I worked for Microsoft for nine and a half years. From October 2009 to early February 2012, I worked in the High Performance Computing Team, first as a Principal Architect (October 2009-November 2011) then briefly as a Principal Program Manager (November 2011-Feburary 2012). During this time, the High Performance Computing team was first part of the Technical Computing Group and then later part of Windows Azure. The Technical Computing Group focused on building software that helps scientists and engineers to do modeling and simulation. Windows Azure provides a public cloud for hosting computation and data. On the High Performance Computing team, I was responsible for the productization of DSC, Dryad, and DryadLINQ, a suite of technologies related to the Big Data trend.

From June 2002 to October 2009, I worked in different groups within Microsoft Research, including the Networking Research Group, the Distributed Systems and Security Group, the Theory Group and the Systems Management Group.

This webpage primarily contains links to my publications. My CV contains other information, such as professional activities, and is available here.

In May of 2002, I completed my PhD in Applied Mathematics at the Massachusetts Institute of Technology. My work was in algorithms under the supervision of Professor Santosh Vempala. We spent a lot of our time in the Laboratory for Computer Science which has a very collegial Theory of Computation Group.

My recent work:

    Machine Learning to Diagnose Failures in Server Tiers:

Bilinear Logistic Regression for Factored Diagnosis Problems. Appeared in Workshop on Managing Large-Scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques (SLAML) 2011 and later in the journal Operating Systems Review (OSR). Joint work with Sumit Basu, Kevin Duh and Kiran-Kumar Muniswamy-Reddy.

    Machine Learning to Partition Graphs:

Active Graph Reachability Reduction for Network Security and Software Engineering. Appeared in International Joint Conference on Artificial Intelligence (IJCAI) 2011. Joint work with Alice Zheng and Ashish Kapoor.

    Connecting Middle Tiers to Scale-out Storage:

Stout: An Adaptive Interface to Scalable Cloud Storage. Appeared in USENIX Annual Technical Conference 2010. Joint work with Alec Wolman, Alex Snoeren and John McCullough.

    Software Infrastructure for Datacenter Services:

Centrifuge: Integrated Lease Management and Partitioning for Cloud Services. Appeared in Symposium on Networked Systems Design and Implementation (NSDI) 2010. Joint work with Atul Adya and Alec Wolman.

    Data Placement for Cloud Services:

Volley: Automated Data Placement for Geo-Distributed Cloud Services. Appeared in Symposium on Networked Systems Design and Implementation (NSDI) 2010. Joint work with Sharad Agarwal, Alec Wolman, Navendu Jain, Stefan Saroiu and Harbinder Bhogan.

    Managing Enterprise Security Configuration:

Heat-ray: Combating identity snowball attacks using machine learning, combinatorial optimization and attack graphs. Appeared in Symposium on Operating Systems Principles (SOSP) 2009. Joint work with Alice Zheng and Dan Simon.

    Characterizing Botnets from Spam:

Characterizing Botnets from Email Spam Records. Appeared in Workshop on Large-Scale Exploits and Emergent Threats (LEET) 2008. Joint work with Li Zhuang, Daniel R. Simon, Helen J. Wang, Ivan Osipkov, Geoff Hulten and J. D. Tygar.

    Investigating PC Unresponsiveness:

Why Did My PC Suddenly Slow Down? Appeared in Workshop on Tackling Computer Systems Problems with Machine Learning Techniques (SysML) 2007. Joint work with Sumit Basu and Greg Smith.

    Solving Systems of Linear Equations:

Iteratively Constructing Preconditioners via the Conjugate Gradient Method. Appeared in Symposium on Theory of Computing (STOC) 2007. Joint work with Nicholas J. A. Harvey.

    Infrastructure for Analyzing Network Protocols:

Generic Application-Level protocol Analyzer and its Language. Appeared in Network and Distributed System Security (NDSS) 2007. Joint work with David Brumley, Nikita Borisov, Helen Wang, Pallavi Joshi and Chuanxiong Guo.

    Protecting Web Browsers using JavaScript Virtualization:

BrowserShield: Vulnerability-Driven Filtering of Dynamic HTML. Appeared in Operating Systems Design and Implementation (OSDI) 2006. Joint work with Charlie Reis, Helen Wang, Opher Dubrovsky, and Saher Esmeir.

    Measuring Shellcode Polymorphism:

Finding Diversity in Remote Code Injection Exploits. Appeared in Internet Measurement Conference (IMC) 2006. Joint work with Justin Ma, Helen Wang, Stefan Savage, and Geoffrey Voelker.

    Applying Machine Learning to Analyzing Network Protocols:

Automatically Extracting Fields from Unknown Network Protocols. Appeared in Systems and Machine Learning Workshop (SysML) 2006. Joint work with Karthik Gopalratnam, Sumit Basu, and Helen Wang.

    Security Access Tracing:

A Black-Box Tracing Technique to Identify Causes of Least-Privilege Incompatibilities. Appeared in Network and Distributed System Security Symposium (NDSS) 2005. Joint work with Shuo Chen, Chad Verbowski, and Yi-Min Wang.

    Failure Detection and Notification:

FUSE: Lightweight Guaranteed Distributed Failure Notification. Appeared in Operating Systems Design and Implementation (OSDI) 2004. Joint work with Nicholas J. A. Harvey, Michael B. Jones, Dejan Kostic, Marvin Theimer, and Alec Wolman.

    Channel Hopping Strategies for Wireless Networks:

SSCH: Slotted Seeded Channel Hopping for Capacity Improvement in IEEE 802.11 Ad-Hoc Wireless Networks. Appeared in International Conference on Mobile Computing and Networking (Mobicom) 2004. Joint work with Ranveer Chandra and Victor Bahl. This and related technologies are discussed in more detail on the MultiNet home page.

    Black-box Troubleshooting:

STRIDER: A Black-box, State-based Approach to Change and Configuration Management and Support. Appeared in Usenix Large Installation System Administration Conference (LISA) 2003 (Best Paper). Joint work with Yi-Min Wang, Chad Verbowski, Yu Chen, Helen J. Wang, Chun Yuan, and Zheng Zhang.

    Patch Management:

Towards A Self-Managing Software Patching Process Using Black-Box Persistent State Manifests. Appeared in International Conference on Autonomic Computing (ICAC) 2004. Joint work with Roussi Roussev, Brad Daniels, Aaron Johnson, Chad Verbowski, and Yi-Min Wang.

    Peer-to-peer Overlay Networks:

SkipNet: A Scalable Overlay Network with Practical Locality Properties. Appeared as Microsoft Research Technical Report. Joint work with Nicholas J. A. Harvey, Michael B. Jones, Stefan Saroiu, Marvin Theimer, and Alec Wolman.

    Overlay Multicast Trees:

Subscriber/Volunteer Trees: Polite, Efficient Overlay Multicast Trees. Appeared as Microsoft Research Technical Report. Joint work with Nicholas J. A. Harvey, Michael B. Jones, Marvin Theimer, and Alec Wolman.

    Experience Implementing a Peer-to-peer System

Engineering Realities of Building a Working Peer-to-Peer System. Appeared as Microsoft Research Technical Report. Joint work with Michael B. Jones.

My earlier work in algorithms:

    Outliers:

Optimal Outlier Removal in High-Dimensional Space. Invited to appear in Journal of Computer and System Sciences (JCSS). Joint work with Santosh Vempala.

An older version of this paper appeared in Symposium on Theory of Computing (STOC) 2001. The current version corrects some statements from the older version.

    Linear Programming:

Smoothed Analysis of the Perceptron Algorithm for Linear Programming. Joint work with Avrim Blum. Appeared in Symposium on Discrete Algorithms (SODA) 2002, and invited to appear in Journal of Algorithms.

Smoothed Analysis of the Renegar's Condition Number for Linear Programming. Appeared in SIAM Conference on Optimization (SIOPT) 2002. Joint work with Dan Spielman and Shang-Hua Teng.

A Polynomial-time Rescaling Algorithm for Solving Linear Programs. Appeared in Symposium on Theory of Computing (STOC) 2004, and then in the journal Mathematical Programming: Series A. Joint work with Santosh Vempala.

    Thesis Work:

A Geometric Theory of Outliers and Perturbation. The thesis contains most of the work in the paper on outliers and the two papers on smoothed analysis of linear programming plus some additional discussion.

    Approximation Algorithms:

On Euclidean Embeddings and Bandwidth Minimization. Appeared in Workshop on Randomization and Computation (RANDOM) 2001. Joint work with Santosh Vempala.