Projects

Past Research Projects

ClickNP is a highly flexible and high-performance network processing platform with reconfigurable hardware. Completely programmable using C-like language and Click-like modular programming abstraction. Process packets at up to 200 million packets per second with less than 2µs latency. Published in SIGCOMM’16.

KV-Direct is a high performance key-value store that leverages programmable NIC to extend RDMA primitives and enable remote direct key-value access to the main host memory. A single NIC achieves 180 million key-value operations per second while keeping tail latency below 10µs. Published in SOSP’17.

FTRouter is a fault-tolerant software architecture for SDN routers, which allows any component to fail or upgrade without interrupting data plane, and the control plane can automatically recover. Dissertation for Bachelor’s Degree.

Ongoing Research Projects

SocksDirect is a high performance user-space socket architecture that is compatible with existing applications and preserves isolation among processes, while being scalable to multiple cores and many concurrent connections. Performance close to shared memory for intra-server sockets and RDMA for inter-server.

ROMS (Reliable Ordered Message Scattering) is an efficient and scalable method to scatter groups of messages in serializable order via data center network. With in-network computation using Barefoot or Arista switches, ROMS achieves high performance and fault tolerance with low CPU and network overheads.

Preliminary Research Projects

FTLinux is a transparent and efficient fault tolerant system for general distributed applications on commodity Linux servers. Efficient mechanisms for process migration, deterministic replay and distributed snapshots. Negligible latency and CPU overhead, fast recovery.

ReactDB is a real-time hybrid HTAP and streaming database that offers serializability efficiently. First, each stored procedure transaction is reactive to updates from other concurrent transactions. Second, physical data layout and indexes are reactive to data access pattern.

RDMA NICs have limited memory to store per-flow states. We design a stateless hardware-based transport in data center networks. Instead of storing per-flow states on endpoints, the states are piggybacked by network packets and keep bouncing between two endpoints.

P4Coder is a system to automatically synthesize hardware-accelerated data plane in P4 language by learning the behavior of an existing software network application. Capable of synthesizing data plane of firewall, TCP, key-value store, Paxos and more.

A transparent PCIe bump-in-the-wire debugger and gateway with a commodity FPGA-based PCIe board. Spoofs PCIe devices and corresponding OS drivers to proxy MMIO and DMA traffic via the PCIe gateway.

A library to transparently hide offloading latency by executing non-conflicting work for existing event-driven concurrent applications.

Participated Research Projects

MP-RDMA is a multi-path hardware-based transport for RDMA, which efficiently utilizes the rich network paths in datacenters, and optimizes for limited on-chip memory in RDMA NICs. Published in NSDI’18.

MELO is a memory efficient loss recovery mechanism for hardware-based transport in datacenters. Up to 14x throughput and 3x less 99% tail FCT with only 23B per-flow state. Published in APNet’17.

FUSO is a novel loss recovery approach that exploits multi-path diversity in datacenter networks. Recovery packets are sent over another sub-flow that is not or less lossy. Published in ATC’16.

Feniks is an operating system for FPGA to facilitate large scale FPGA deployment in datacenters. Provides abstracted interface, direct PCIe device access and resource allocation. Published in APSys’17.

Selected Engineering Projects

A scalable and efficient architecture for RSA encryption/decryption on FPGA to accelerate HTTPS handshake. Throughput equivalent to 20 CPU cores.

icourse.club is a website for USTC students to rate and review courses. Since its inception in May 2015, icourse.club has gained 1,000+ users, who generated 3,500+ high-quality reviews and ratings for 1,000+ courses in USTC. Source code.

ClickNP is a highly flexible and high-performance network processing platform with reconfigurable hardware, published in SIGCOMM’16. This talk will share our implementation experience of the ClickNP system, both before and after paper submission.

LUG VPN is a smart global VPN network to enable students efficiently access every host across the Internet from any network location. Users connect to an access gateway nearby, which selects an egress gateway close to destination, forwards to the egress gateway via optimized tunnel.

ClickNP is a highly flexible and high-performance network processing platform with reconfigurable hardware, published in SIGCOMM’16. This talk will share our implementation experience of the ClickNP system, both before and after paper submission.

ClickNP is a highly flexible and high-performance network processing platform with reconfigurable hardware, published in SIGCOMM’16. This talk will share our implementation experience of the ClickNP system, both before and after paper submission.

ClickNP is a highly flexible and high-performance network processing platform with reconfigurable hardware, published in SIGCOMM’16. This talk will share our implementation experience of the ClickNP system, both before and after paper submission.

ClickNP is a highly flexible and high-performance network processing platform with reconfigurable hardware, published in SIGCOMM’16. This talk will share our implementation experience of the ClickNP system, both before and after paper submission.