Publications

Publications by category in reverse chronological order.

2025

  1. ATC
    Towards Optimal Rack-scale μs-level CPU Scheduling through In-Network Workload Shaping
    Xudong Liao, Han Tian, Xinchen Wan, Chaoliang Zeng, Hao Wang, Junxue Zhang, and 3 more authors
    In Proceedings of the USENIX Annual Technical Conference (ATC 2025), 2025
  2. ASPLOS
    Harmonia: A Unified Framework for Heterogeneous FPGA Acceleration in the Cloud
    Luyang Li, Heng Pan, Xinchen Wan, Kai Lv, Zilong Wang, Qian Zhao, and 6 more authors
    In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2025), 2025
  3. INFOCOM
    A Generic and Efficient Communication Framework for Message-level In-Network Computing
    Xinchen Wan, Luyang Li, Han Tian, Xudong Liao, Xinyang Huang, Chaoliang Zeng, and 7 more authors
    In Proceedings of the IEEE International Conference on Computer Communications (INFOCOM 2025), 2025
  4. ASPLOS
    Design and Operation of Shared Machine Learning Clusters on Campus
    Kaiqiang Xu, Decang Sun, Hao Wang, Zhenghang Ren, Xinchen Wan, Xudong Liao, and 3 more authors
    In Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2025), 2025
  5. EuroSys
    Achieving Fairness Generalizability for Learning-based Congestion Control with Jury
    Han Tian, Xudong Liao, Decang Sun, Chaoliang Zeng, Yilun Jin, Junxue Zhang, and 4 more authors
    In Proceedings of the 20th ACM European Conference on Computer Systems (EuroSys 2025), 2025

2024

  1. SIGCOMM
    Fast, Scalable, and Accurate Rate Limiter for RDMA NICs
    Zilong Wang, Xinchen Wan, Luyang Li, Yijun Sun, Peng Xie, Xin Wei, and 3 more authors
    In Proceedings of the ACM Special Interest Group on Data Communication (SIGCOMM 2024), 2024
  2. EuroSys
    Astraea: Towards Fair and Efficient Learning-based Congestion Control
    Xudong Liao, Han Tian, Chaoliang Zeng, Xinchen Wan, and Kai Chen
    In Proceedings of the 19th ACM European Conference on Computer Systems (EuroSys 2024), 2024
  3. NSDI
    Accelerating Neural Recommendation Training with Embedding Scheduling
    Chaoliang Zeng, Xudong Liao, Xiaodian Cheng, Han Tian, Xinchen Wan, Hao Wang, and 1 more author
    In Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 2024), 2024
  4. NSDI
    Towards Domain-Specific Network Transport for Distributed DNN Training
    Hao Wang, Han Tian, Jingrong Chen, Xinchen Wan, Jiacheng Xia, Gaoxiong Zeng, and 4 more authors
    In Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 2024), 2024

2023

  1. APNET
    Accurate and Scalable Rate Limiter for RDMA NICs
    Zilong Wang, Xinchen Wan, Chaoliang Zeng, and Kai Chen
    In Proceedings of the 7th Asia-Pacific Workshop on Networking (APNet 2023), 2023
  2. SIGMOD
    Scalable and Efficient Full-Graph GNN Training for Large Graphs
    Xinchen Wan, Kaiqiang Xu, Xudong Liao, Yilun Jin, Kai Chen, and Xin Jin
    In Proceedings of the ACM on Management of Data (SIGMOD 2023), 2023
  3. NSDI
    SRNIC: A Scalable Architecture for RDMA NICs
    Zilong Wang, Layong Luo, Qingsong Ning, Chaoliang Zeng, Wenxue Li, Xinchen Wan, and 5 more authors
    In Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2023), 2023

2022

  1. ICNP
    DGS: Communication-Efficient Graph Sampling for Distributed GNN Training
    Xinchen Wan, Kai Chen, and Yiming Zhang
    In Proceedings of the 30th IEEE International Conference on Network Protocols (ICNP 2022), 2022

2021

  1. ArXiv
    TACC: A Full-stack Cloud Computing Infrastructure for Machine Learning Tasks
    Kaiqiang Xu, Xinchen Wan, Hao Wang, Zhenghang Ren, Xudong Liao, Decang Sun, and 2 more authors
    arXiv preprint arXiv:2110.01556, 2021

2020

  1. ArXiv
    Domain-specific Communication Optimization for Distributed DNN Training
    Hao Wang, Jingrong Chen, Xinchen Wan, Han Tian, Jiacheng Xia, Gaoxiong Zeng, and 4 more authors
    arXiv preprint arXiv:2008.08445, 2020
  2. APNET
    RAT - Resilient Allreduce Tree for Distributed Machine Learning
    Xinchen Wan, Hong Zhang, Hao Wang, Shuihai Hu, Junxue Zhang, and Kai Chen
    In Proceedings of the 4th Asia-Pacific Workshop on Networking (APNet 2020), 2020