Bilingual Category Induction with Recursive Neural Network

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Bilingual Category Induction with Recursive Neural Network"

By

Mr. Yuchen YAN


Abstract:

Inversion Transduction Grammars (ITGs) can symbolically represent a
transduction relationship with far more explanatory power than mainstream
neural network approaches. However, inducing an ITG from a parallel corpus
remains a challenging problem whose main bottleneck is category induction:
the space of possible ITG rules grows exponentially with the number of
categories, making symbolic search infeasible.

Later, Transductive Recursive Auto-Associative Memory (TRAAM) showed the
possibility of hinting useful categories with the help of a recursive neural
network, but the quality of the produced categories is far from ideal. We
observe that TRAAM suffers from vanishing gradients, and that its
self-reconstruction training objective does not align with contextually
appropriate bilingual categorization. For the vanishing gradient problem,
most mainstream remedies such as LSTMs, GRUs, skip connections, and Gated
Linear Units either require sequential input topologies or have an unbounded
output range, and thus cannot be used on biparse trees; we therefore design
our own recursive network architecture. For the training objective, we argue
that context-aware objectives suit the task better than self-reconstruction
objectives, and that among common context-aware objectives, sibling token
prediction generalizes more easily to biparse trees than the autoregressive
objective used in GPT or the next sentence prediction objective used in BERT.
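
To give a flavour of the kind of cell we have in mind, below is a minimal
C++ sketch of a gated recursive composition cell with a bounded (tanh)
output that can be applied node by node to a binary biparse tree. It is
illustrative only: the names, dimensions, gating scheme, and initialization
are assumptions made for this sketch, not the thesis architecture.

#include <cmath>
#include <initializer_list>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

struct Cell {
    int d;               // embedding dimension
    Mat Wl, Wr, Gl, Gr;  // composition and gate weights (d x d)

    explicit Cell(int dim)
        : d(dim), Wl(dim, Vec(dim)), Wr(dim, Vec(dim)),
          Gl(dim, Vec(dim)), Gr(dim, Vec(dim)) {
        std::mt19937 rng(42);
        std::uniform_real_distribution<double> u(-0.1, 0.1);
        for (Mat* M : {&Wl, &Wr, &Gl, &Gr})
            for (Vec& row : *M)
                for (double& w : row) w = u(rng);
    }

    static Vec matvec(const Mat& M, const Vec& x) {
        Vec y(M.size(), 0.0);
        for (size_t i = 0; i < M.size(); ++i)
            for (size_t j = 0; j < x.size(); ++j)
                y[i] += M[i][j] * x[j];
        return y;
    }

    // Compose two child embeddings into a parent embedding. The gate
    // interpolates between a fresh tanh composition and a bounded
    // pass-through of the children, so the output stays in (-1, 1)
    // while gradients get a short path down to the leaves.
    Vec compose(const Vec& l, const Vec& r) const {
        Vec hl = matvec(Wl, l), hr = matvec(Wr, r);
        Vec gl = matvec(Gl, l), gr = matvec(Gr, r);
        Vec out(d);
        for (int i = 0; i < d; ++i) {
            double cand  = std::tanh(hl[i] + hr[i]);
            double gate  = 1.0 / (1.0 + std::exp(-(gl[i] + gr[i])));
            double carry = 0.5 * (l[i] + r[i]);  // bounded skip path
            out[i] = gate * cand + (1.0 - gate) * carry;
        }
        return out;
    }
};

int main() {
    Cell cell(8);
    Vec l(8, 0.1), r(8, -0.2);
    Vec parent = cell.compose(l, r);  // each component stays in (-1, 1)
    (void)parent;
}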

Even with our improved method for hinting bilingual categories, inducing a
categorized ITG remains challenging: TRAAM-based bilingual category hinting
relies on high-quality biparse tree topologies, but high-quality biparse
tree topologies rely on a high-quality categorized ITG. To solve this
chicken-and-egg problem, we introduce a feedback training pipeline that
co-trains the categorized ITG and our improved TRAAM network at the same
time.
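
The following minimal C++ sketch shows the shape of this feedback loop.
All types and function bodies here are illustrative stubs standing in for
the real components, not the actual pipeline: the current ITG proposes
biparse topologies, the network trains on them, and the network's hints
feed back to refine the ITG's categories.

#include <string>
#include <vector>

struct SentencePair { std::string src, tgt; };
struct BiparseTree {};     // one biparse topology
struct CategorizedITG {};  // ITG together with its induced categories
struct TraamNet {};        // the recursive category-hinting network

// Stubs standing in for the real components.
std::vector<BiparseTree> biparse(const CategorizedITG&,
                                 const std::vector<SentencePair>&) { return {}; }
void trainNetwork(TraamNet&, const std::vector<BiparseTree>&) {}
CategorizedITG refineCategories(const TraamNet&, const CategorizedITG&) {
    return {};
}

int main() {
    std::vector<SentencePair> corpus;  // parallel corpus (loading omitted)
    CategorizedITG itg;                // seed ITG before category refinement
    TraamNet net;

    for (int round = 0; round < 10; ++round) {
        // 1. The current ITG proposes biparse tree topologies.
        std::vector<BiparseTree> trees = biparse(itg, corpus);
        // 2. The network trains on those topologies.
        trainNetwork(net, trees);
        // 3. The network's hints feed back to refine the ITG's categories.
        itg = refineCategories(net, itg);
    }
}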

On the implementation side, mainstream deep learning frameworks such as
TensorFlow and PyTorch lack the flexibility to run our feedback training
pipeline, which requires integrating a symbolic ITG biparsing algorithm into
the training loop. We therefore also develop our own deep learning framework
in C++.
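
The sketch below shows the core requirement in miniature (the types are
illustrative assumptions, not our framework's actual API): the computation
graph must be rebuilt for every sentence pair so that it mirrors the biparse
tree topology produced by the symbolic biparser inside the training loop.

#include <memory>
#include <vector>

struct Tree {                           // a biparse tree node
    std::unique_ptr<Tree> left, right;  // null for leaves
};

struct Node {                           // a computation-graph node
    std::vector<Node*> inputs;          // forward()/backward() omitted
};

// Rebuild the computation graph so that its shape mirrors the biparse
// tree: every sentence pair gets a differently shaped network, decided
// at run time by the symbolic biparser.
Node* build(const Tree& t, std::vector<std::unique_ptr<Node>>& graph) {
    graph.push_back(std::make_unique<Node>());
    Node* n = graph.back().get();
    if (t.left && t.right)
        n->inputs = { build(*t.left, graph), build(*t.right, graph) };
    return n;
}

int main() {
    Tree root;  // a one-node tree for illustration
    std::vector<std::unique_ptr<Node>> graph;
    build(root, graph);
}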


Date:                   Monday, 21 August 2023

Time:                   2:00pm - 4:00pm

Venue:                  Room 3494
                        Lifts 25/26

Chairman:               Prof. Amy DALTON (MARK)

Committee Members:      Prof. Dekai WU (Supervisor)
                        Prof. Andrew HORNER
                        Prof. Nevin ZHANG
                        Prof. Tao LIU (PHYS)
                        Prof. Lei SHA (Beihang University)


**** ALL are Welcome ****