TOWARDS ENHANCING RELIABILITY AND PERFORMANCE OF DATA-CENTRIC SYSTEMS WITH STATIC ANALYSIS

PhD Thesis Proposal Defence


Title: "TOWARDS ENHANCING RELIABILITY AND PERFORMANCE OF DATA-CENTRIC SYSTEMS
WITH STATIC ANALYSIS"

by

Mr. Chengpeng WANG


Abstract:

In the era of big data, data-centric systems have become the backbone of our
computing infrastructure. These systems offer a wide range of services in our
daily lives by processing various forms of data. The prevalence of data-centric
systems highlights the importance of improving their reliability and
performance. Unreliable or inefficient data-centric systems can cause
unanticipated economic losses or consume unnecessary computation resources,
threatening property safety and degrading the service experience.

Static analysis is a technique that analyzes program behaviors without
executing programs. Over several decades, the community has applied it with
great success in domains such as program optimization, bug detection, and
program synthesis. However, little attention has been paid to data-centric
systems, leaving their reliability and performance issues insufficiently
addressed.

This thesis presents our efforts in enhancing the reliability and performance
of data-centric systems using static analysis. Our approach addresses
data-centric system analysis from two perspectives, namely the application and
data sides, for detecting bugs and optimizing programs in a systematic manner.

Our first focus is on ubiquitous data structures, called containers, from the
application side of data-centric systems. As erroneous value flows can be
propagated through containers, a static analyzer must precisely reason about
container memory layout, which can cause significant overhead in analyzing
large-scale systems. To address this, we introduce ANCHOR, which uses memory
orientation analysis to apply strong updates upon container memory layouts and
conducts a demand-driven reachability analysis in the value-flow graph.
ANCHOR can support various value-flow analysis clients, such as program slicing
and value-flow bug detection, achieving both high precision and efficiency.

Our second work improves system performance from the application side by
optimizing container usage. We present CRES, a container replacement
synthesizer that detects and replaces inefficient container usage. It
statically identifies container usage and selects a method with lower time
complexity for each container method call. CRES can preserve program behavior
and effectively reduce execution time for general inputs.
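
As a rough illustration of the kind of inefficient container usage described
above, the sketch below shows a hand-written replacement. It is not taken from
the thesis and is not CRES output: the function names are hypothetical, and the
choice of Python's list and set is an assumption made only for this example.
The list-based version performs a linear-time membership test on each
iteration, while the set-based version performs the same test in expected
constant time. Because the container is used only for membership checks and
insertion, swapping it leaves the result unchanged, assuming the items are
hashable.

```python
def find_duplicates_list(items):
    """Hypothetical original: `seen` is a list, so `item in seen` is O(n)."""
    seen = []
    duplicates = []
    for item in items:
        if item in seen:          # linear scan of the list
            duplicates.append(item)
        else:
            seen.append(item)
    return duplicates


def find_duplicates_set(items):
    """Hypothetical replacement: `seen` is a set, so `item in seen` is
    expected O(1). Assumes every item is hashable."""
    seen = set()
    duplicates = []
    for item in items:
        if item in seen:          # hash lookup
            duplicates.append(item)
        else:
            seen.add(item)
    return duplicates


if __name__ == "__main__":
    data = [3, 1, 3, 2, 1]
    # Both versions report repeated items in order of their repeat occurrence.
    print(find_duplicates_list(data))
    print(find_duplicates_set(data))
```

The `duplicates` list is left unchanged in both versions because its element
order is observable to the caller; only the container whose ordering is never
observed is replaced.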

Our third work targets domain-specific programs, called data constraints, from
the data side of the systems. These programs are executed over database tables
to monitor data consistency, but equivalent data constraints are common and
waste computation resources. To address this issue, we
present EQDAC, an efficient decision procedure that refutes and proves the
equivalence of data constraints in polynomial time using two lightweight
analyses. EQDAC supports equivalence searching and clustering efficiently,
resolving redundant data constraints and improving system performance.


Date:                   Tuesday, 22 August 2023

Time:                   1:00pm - 3:00pm

Venue:                  Room 3494
                        lifts 25/26

Committee Members:      Prof. Charles Zhang (Supervisor)
                        Dr. Shuai Wang (Chairperson)
                        Dr. Lionel Parreaux
                        Dr. Jiasi Shen


**** ALL are Welcome ****