SOSP 2023 - Symposium on Operating Systems Principles

Accepted Papers

The following papers have been accepted to appear at the 29th ACM SIGOPS Symposium on Operating Systems Principles (SOSP), conditional on the approval of each paper's shepherd:

A Cloud-Scale Characterization of Remote Procedure Calls by Korakit Seemakhupt (University of Virginia), Brent E. Stephens (Google and University of Utah), Samira Khan (Google and University of Virginia), Sihang Liu (University of Waterloo), Hassan Wassel (Google), Soheil Hassas Yeganeh (Google), Alex C. Snoeren (Google and UC San Diego), Arvind Krishnamurthy (Google and University of Washington), David Culler (Google) and Henry M. Levy (Google and University of Washington)

Achieving Microsecond-Scale Tail Latency Efficiently with Approximate Optimal Scheduling by Rishabh Iyer (EPFL), Musa Unal (EPFL), Marios Kogias (Imperial College London) and George Candea (EPFL)

Acto: Automatic End-to-End Testing for Operation Correctness of Cloud System Management by Jiawei Tyler Gu (University of Illinois at Urbana-Champaign), Xudong Sun (University of Illinois at Urbana-Champaign), Wentao Zhang (University of Illinois at Urbana-Champaign), Yuxuan Jiang (University of Illinois at Urbana-Champaign), Chen Wang (IBM Research), Mandana Vaziri (IBM Research), Owolabi Legunsen (Cornell University) and Tianyin Xu (University of Illinois at Urbana-Champaign)

Antipode: Enforcing Cross-Service Causal Consistency in Distributed Applications by João Loff (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa), Daniel Porto (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa), João Garcia (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa), Jonathan Mace (Max Planck Institute for Software Systems and Microsoft Research) and Rodrigo Rodrigues (INESC-ID, Instituto Superior Técnico, Universidade de Lisboa)

Arboretum: A Planner for Large-Scale Federated Analytics with Differential Privacy by Elizabeth Margolin (University of Pennsylvania), Karan Newatia (University of Pennsylvania), Tao Luo (University of Pennsylvania), Edo Roth (University of Pennsylvania) and Andreas Haeberlen (University of Pennsylvania and Roblox)

Automated Verification of an In-Production DNS Authoritative Engine by Naiqian Zheng (Peking University), Mengqi Liu (Alibaba Cloud), Yuxing Xiang (Peking University), Linjian Song (Alibaba Cloud), Dong Li (Alibaba Cloud), Feng Han (Alibaba Cloud), Nan Wang (Alibaba Cloud), Yong Ma (Alibaba Cloud), Zhuo Liang (Alibaba Cloud), Dennis Cai (Alibaba Cloud), Ennan Zhai (Alibaba Cloud), Xuanzhe Liu (Peking University) and Xin Jin (Peking University)

Bagpipe: Accelerating Deep Recommendation Model Training by Saurabh Agarwal (University of Wisconsin-Madison), Chengpo Yan (University of Wisconsin-Madison), Ziyi Zhang (University of Chicago) and Shivaram Venkataraman (University of Wisconsin-Madison)

Blueprint: A Toolchain for Highly-Reconfigurable Microservice Applications by Vaastav Anand (Max Planck Institute for Software Systems), Deepak Garg (Max Planck Institute for Software Systems), Antoine Kaufmann (Max Planck Institute for Software Systems) and Jonathan Mace (Microsoft Research)

Cornflakes: Zero-Copy Serialization for Microsecond-Scale Networking by Deepti Raghavan (Stanford University), Shreya Ravi (Stanford University), Gina Yuan (Stanford University), Pratiksha Thaker (Carnegie Mellon University), Sanjari Srivastava (Stanford University), Micah Murray (UC Berkeley), Pedro Henrique Penna (Microsoft Research), Amy Ousterhout (UC San Diego), Philip Levis (Stanford University and Google), Matei Zaharia (UC Berkeley) and Irene Zhang (Microsoft Research)

Ditto: An Elastic and Adaptive Memory-Disaggregated Caching System by Jiacheng Shen (The Chinese University of Hong Kong), Pengfei Zuo (Huawei Cloud), Xuchuan Luo (Fudan University), Yuxin Su (Sun Yat-sen University), Jiazhen Gu (The Chinese University of Hong Kong), Hao Feng (Huawei Cloud), Yangfan Zhou (Fudan University) and Michael R. Lyu (The Chinese University of Hong Kong)

Edna: Disguising and Revealing User Data in Web Applications by Lillian Tsai (MIT), Hannah Gross (Brown University), Eddie Kohler (Harvard University), Frans Kaashoek (MIT) and Malte Schwarzkopf (Brown University)

Efficient Memory Management for Large Language Model Serving with PagedAttention by Woosuk Kwon (UC Berkeley), Zhuohan Li (UC Berkeley), Siyuan Zhuang (UC Berkeley), Ying Sheng (Stanford University), Lianmin Zheng (UC Berkeley), Cody Hao Yu (Independent Researcher), Joseph Gonzalez (UC Berkeley), Hao Zhang (UC San Diego) and Ion Stoica (UC Berkeley)

Enabling High-Performance and Secure Userspace NVM File Systems with the Trio Architecture by Diyu Zhou (EPFL), Vojtech Aschenbrenner (EPFL), Tao Lyu (EPFL), Jian Zhang (Rutgers University), Sudarsun Kannan (Rutgers University) and Sanidhya Kashyap (EPFL)

FIFO queues are all you need for cache eviction by Juncheng Yang (Carnegie Mellon University), Yazhuo Zhang (Emory University), Ziyue Qiu (Carnegie Mellon University), Yao Yue (Pelikan Foundation) and Rashmi Vinayak (Carnegie Mellon University)

Falcon: Fast OLTP Engine for Persistent Cache and Non-Volatile Memory by Zhicheng Ji (Tsinghua University), Kang Chen (Tsinghua University and Zhongguancun Laboratory), Leping Wang (Tsinghua University), Mingxing Zhang (Tsinghua University) and Yongwei Wu (Tsinghua University)

Flexible Advancement in Asynchronous BFT Consensus by Shengyun Liu (Shanghai Jiao Tong University), Wenbo Xu (Blockchain Platform Division, Ant Group), Chen Shan (Blockchain Platform Division, Ant Group), Xiaofeng Yan (Blockchain Platform Division, Ant Group), Tianjing Xu (Blockchain Platform Division, Ant Group), Bo Wang (Blockchain Platform Division, Ant Group), Lei Fan (Shanghai Jiao Tong University), Fuxi Deng (Blockchain Platform Division, Ant Group), Ying Yan (Blockchain Platform Division, Ant Group) and Hui Zhang (Blockchain Platform Division, Ant Group)

GEMINI: Fast Failure Recovery in Distributed Training with In-Memory Checkpoints by Zhuang Wang (Rice University), Zhen Jia (Amazon Web Services, Inc.), Shuai Zheng (Amazon Web Services), Zhen Zhang (Amazon Web Services), Xinwei Fu (Amazon Web Services), T. S. Eugene Ng (Rice University) and Yida Wang (Amazon)

Grove: a Separation-Logic Library for Verifying Distributed Systems by Upamanyu Sharma (Massachusetts Institute of Technology), Ralf Jung (ETH Zurich), Joseph Tassarotti (New York University), Frans Kaashoek (MIT) and Nickolai Zeldovich (MIT)

Halfmoon: Log-Optimal Fault-Tolerant Stateful Serverless Computing by Sheng Qi (Peking University), Xuanzhe Liu (Peking University) and Xin Jin (Peking University)

MEMTIS: Efficient Memory Tiering with Dynamic Page Classification and Page Size Determination by Taehyung Lee (Sungkyunkwan University), Sumit Kumar Monga (Virginia Tech), Changwoo Min (Igalia) and Young Ik Eom (Dept. of Electrical and Computer Engineering / College of Computing and Informatics, Sungkyunkwan University)

Mira: A Program-Behavior-Guided Far Memory System by Zhiyuan Guo (University of California, San Diego), Zijian He (University of California, San Diego) and Yiying Zhang (University of California, San Diego)

One Simple API Can Cause Hundreds of Bugs: An Analysis of Refcounting Bugs in All Modern Linux Kernels by Liang He (Institute of Software, CAS China), Purui Su (Institute of Software, CAS China), Chao Zhang (Tsinghua University), Yan Cai (Institute of Software, Chinese Academy of Sciences) and Jinxin Ma (CNITSEC)

Oobleck: Resilient Distributed Training of Large Models Using Pipeline Templates by Insu Jang (University of Michigan), Zhenning Yang (University of Michigan), Zhen Zhang (Amazon Web Services), Xin Jin (Peking University) and Mosharaf Chowdhury (University of Michigan)

PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation by Ningxin Zheng (Microsoft Research), Huiqiang Jiang (Microsoft Research), Quanlu Zhang (Microsoft Research), Zhenhua Han (Microsoft Research), Lingxiao Ma (Microsoft Research), Yuqing Yang (Microsoft Research), Fan Yang (Microsoft Research), Chengruidong Zhang (Microsoft Research), Lili Qiu (Microsoft Research), Mao Yang (Microsoft Research) and Lidong Zhou (Microsoft Research)

PVM: Efficient Shadow Paging for Deploying Secure Containers in Cloud-native Environment by Hang Huang (Alibaba Group), Jiangshan Lai (Ant Group), Jia Rao (The University of Texas at Arlington), Hui Lu (The University of Texas at Arlington), Wenlong Hou (Ant Group), Hang Su (Ant Group), Quan Xu (Alibaba Group), Jiang Zhong (Alibaba Group), Jiahao Zeng (Alibaba Group), Xu Wang (Ant Group), Zhengyu He (Ant Group), Weidong Han (Alibaba Group), Jiang Liu (Alibaba Group), Tao Ma (Alibaba Group) and Song Wu (Huazhong University of Science and Technology)

Paella: Low-latency Model Serving with Software-defined GPU Scheduling by Kelvin K.W. Ng (University of Pennsylvania), Henri Maxime Demoulin (DBOS, Inc.) and Vincent Liu (University of Pennsylvania)

Partial Failure Resilient Memory Management System for (CXL-based) Distributed Shared Memory by Mingxing Zhang (Tsinghua University), Teng Ma (Alibaba Group), Jinqi Hua (Tsinghua University), Zheng Liu (Alibaba Group), Kang Chen (Tsinghua University), Ning Ding (Alibaba Group), Fan Du (Intel), Jinlei Jiang (Tsinghua University), Tao Ma (Alibaba Group) and Yongwei Wu (Tsinghua University)

Private Web Search with Tiptoe by Alexandra Henzinger (MIT), Emma Dauterman (UC Berkeley), Henry Corrigan-Gibbs (MIT) and Nickolai Zeldovich (MIT)

Project Silica: Towards Sustainable Cloud Archival Storage in Glass by Patrick Anderson (Microsoft), Erika Blancada Aranas (Microsoft), Youssef Assaf (Microsoft), Raphael Behrendt (Microsoft), Richard Black (Microsoft), Marco Caballero (Microsoft), Pashmina Cameron (Microsoft), Burcu Canakci (Microsoft), Thales de Carvalho (Microsoft), Andromachi Chatzieleftheriou (Microsoft), Rebekah Storan Clarke (Microsoft), James Clegg (Microsoft), Daniel Cletheroe (Microsoft), Bridgette Cooper (Microsoft), Tim Deegan (Microsoft), Austin Donnelly (Microsoft), Rokas Drevinskas (Microsoft), Alexander Gaunt (Microsoft), Christos Gkantsidis (Microsoft), Ariel Gomez Diaz (Microsoft), Istvan Haller (Microsoft), Freddie Hong (Microsoft), Teodora Ilieva (Microsoft), Shashidhar Joshi (Microsoft), Russell Joyce (Microsoft), Mint Kunkel (Microsoft), David Lara (Microsoft), Sergey Legtchenko (Microsoft), Fanglin Linda Liu (Microsoft), Bruno Magalhaes (Microsoft), Alana Marzoev (Microsoft), Marvin McNett (Microsoft), Jayashree Mohan (Microsoft), Michael Myrah (Microsoft), Trong Nguyen (Microsoft), Sebastian Nowozin (Microsoft), Aaron Ogus (Microsoft), Hiske Overweg (Microsoft), Antony Rowstron (Microsoft), Maneesh Sah (Microsoft), Masaaki Sakakura (Microsoft), Peter Scholtz (Microsoft), Nina Schreiner (Microsoft), Omer Sella (Microsoft), Adam Smith (Microsoft), Ioan Stefanovici (Microsoft), David Sweeney (Microsoft), Benn Thomsen (Microsoft), Govert Verkes (Microsoft), Phil Wainman (Microsoft), Jonathan Westcott (Microsoft), Luke Weston (Microsoft), Charles Whittaker (Microsoft), Pablo Wilke Berenguer (Microsoft), Hugh Williams (Microsoft), Thomas Winkler (Microsoft) and Stefan Winzeck (Microsoft)

Pushing Performance Isolation Boundaries into Application with pBox by Yigong Hu (Johns Hopkins University), Gongqi Huang (Johns Hopkins University) and Peng Huang (University of Michigan)

QuePaxa: Escaping the Tyranny of Timeouts in Consensus by Pasindu Tennage (Ecole Polytechnique Federale de Lausanne (EPFL)), Cristina Basescu (Ecole Polytechnique Federale de Lausanne (EPFL)), Lefteris Kokoris-Kogias (IST Austria, Mysten Labs), Ewa Syta (Trinity College), Philipp Jovanovic (UCL), Vero Estrada-Galinanes (Ecole Polytechnique Federale de Lausanne (EPFL)) and Bryan Ford (Ecole Polytechnique Federale de Lausanne (EPFL))

RackBlox: A Software-Defined Rack-Scale Storage System with Network-Storage Co-Design by Benjamin Reidys (University of Illinois Urbana-Champaign), Yuqi Xue (University of Illinois Urbana-Champaign), Daixuan Li (University of Illinois Urbana-Champaign), Bharat Sukhwani (IBM Research), Wen-mei Hwu (University of Illinois Urbana-Champaign), Deming Chen (University of Illinois Urbana-Champaign), Sameh Asaad (IBM Research) and Jian Huang (University of Illinois Urbana-Champaign)

SPFresh: Incremental In-Place Update for Billion-Scale Vector Search by Yuming Xu (University of Science and Technology of China & Microsoft Research), Hengyu Liang (University of Science and Technology of China), Jin Li (Harvard University), Shuotao Xu (Microsoft Research), Qi Chen (Microsoft Research), Qianxi Zhang (Microsoft Research), Cheng Li (University of Science and Technology of China), Ziyue Yang (Microsoft Research), Fan Yang (Microsoft Research Asia), Yuqing Yang (Microsoft Research), Peng Cheng (Microsoft Research) and Mao Yang (Microsoft Research)

Sia: Heterogeneity-aware, goodput-optimized ML-cluster scheduling by Suhas Jayaram Subramanya (Carnegie Mellon University), Daiyaan Arfeen (Carnegie Mellon University), Shouxu Lin (Cornell University), Aurick Qiao (Petuum, Inc.), Zhihao Jia (Carnegie Mellon University) and Gregory R. Ganger (Carnegie Mellon University)

Siloz: Leveraging DRAM Isolation Domains to Prevent Inter-VM Rowhammer by Kevin Loughlin (University of Michigan), Jonah Rosenblum (University of Michigan), Stefan Saroiu (Microsoft), Alec Wolman (Microsoft), Dimitrios Skarlatos (Carnegie Mellon University) and Baris Kasikci (University of Washington and Google)

Snowcat: Efficient Kernel Concurrency Testing using a Learned Coverage Predictor by Sishuai Gong (Purdue University), Dinglan Peng (Purdue University), Deniz Altınbüken (Google DeepMind), Pedro Fonseca (Purdue University) and Petros Maniatis (Google DeepMind)

TreeSLS: A Whole-system Persistent Microkernel with Tree-structured State Checkpoint on NVM by Fangnuo Wu (Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University), Mingkai Dong (Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University), Gequan Mo (Shanghai Jiao Tong University) and Haibo Chen (Shanghai Jiao Tong University)

Turbo: Effective Caching in Differentially-Private Databases by Kelly Kostopoulou (Columbia University), Pierre Tholoniat (Columbia University), Asaf Cidon (Columbia University), Roxana Geambasu (Columbia University) and Mathias Lécuyer (University of British Columbia)

UGACHE: A Unified GPU Cache for Embedding-based Deep Learning by Xiaoniu Song (Shanghai Jiao Tong University; Shanghai Artificial Intelligence Laboratory), Yiwen Zhang (Shanghai Jiao Tong University), Rong Chen (Shanghai Jiao Tong University; Shanghai Artificial Intelligence Laboratory) and Haibo Chen (Shanghai Jiao Tong University)

Understanding Silent Data Corruptions in a Large Production CPU Population by Shaobu Wang (Tsinghua University), Guangyan Zhang (Tsinghua University), Junyu Wei (Tsinghua University), Yang Wang (Meta/The Ohio State University), Jiesheng Wu (Alibaba Cloud) and Qingchao Luo (Alibaba Cloud)

Validating JIT Compilers via Compilation Space Exploration by Cong Li (Nanjing University), Yanyan Jiang (Nanjing University), Chang Xu (Nanjing University) and Zhendong Su (ETH Zurich)

XFaaS: Hyperscale and Low Cost Serverless Functions at Meta by Alireza Sahraei (Meta Platforms, Inc), Soteris Demetriou (Imperial College London, Meta Platforms), Amirali Sobhgol (Meta Platforms, Inc), Haoran Zhang (University of Pennsylvania), Abhigna Nagaraja (Meta Platforms, Inc), Neeraj Pathak (Meta Platforms, Inc), Girish Joshi (Meta Platforms, Inc), Carla Souza (Meta Platforms, Inc), Bo Huang (Meta Platforms, Inc), Wyatt Cook (Meta Platforms, Inc), Andrii Golovei (Meta Platforms, Inc), Pradeep Venkat (Meta Platforms, Inc), Andrew McFague (Meta Platforms, Inc), Dimitrios Skarlatos (Carnegie Mellon University, Meta Platforms), Vipul Patel (Meta Platforms, Inc), Ravinder Thind (Meta Platforms, Inc), Ernesto Gonzalez (Meta Platforms, Inc), Yun Jin (Meta Platforms, Inc) and Chunqiang Tang (Meta Platforms, Inc)

gSampler: General and Efficient GPU-based Graph Sampling for Graph Learning by Ping Gong (University of Science and Technology of China), Renjie Liu (Southern University of Science and Technology), Zunyao Mao (Southern University of Science and Technology), Zhenkun Cai (AWS AI Shanghai Lab), Xiao Yan (Southern University of Science and Technology), Cheng Li (University of Science and Technology of China), Minjie Wang (AWS AI Shanghai Lab) and Zhuozhao Li (Southern University of Science and Technology)