Distant Domain Transfer Learning

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence

Title: "Distant Domain Transfer Learning"

By

Mr. Ben TAN


Abstract

Transfer learning adapts and reuses knowledge from source domains for a target 
domain. It has attained much popularity in data mining and machine learning, as 
well as many other areas. A major assumption in many transfer learning 
algorithms is that the source and target domains should be closely related. 
This relation can take the form of related instances, features or models, and 
can be measured by the KL-divergence or A-distance. However, if two domains are not 
directly related, performing knowledge transfer between these domains will not 
be effective. This source-target domain gap is a serious impediment to the 
successful application of transfer learning.

In this thesis, we study a novel learning problem: Distant Domain Transfer 
Learning (abbreviated to DDTL). In DDTL, we aim to bridge large domain gaps 
and transfer knowledge even if the source and target domains share few factors 
directly. For example, the source domain may contain plenty of labeled text 
documents while the target domain is composed of image data, so the two have 
completely different feature spaces; or the source domain may classify face 
images while the target domain distinguishes plane images, so the two share no 
common characteristic in shape or other aspects and are conceptually distant. 
The DDTL problem is important because solving it can greatly expand the 
application scope of transfer learning and help reuse as much previous 
knowledge as possible. Nonetheless, it is a difficult problem because the 
distribution gap between the source domain and the target domain is large.

Inspired by human transitive inference and learning ability, whereby two 
seemingly unrelated concepts can be connected by a string of intermediate 
bridges using auxiliary concepts, in this thesis we propose a novel learning 
framework: transitive transfer learning (abbreviated to TTL). The main idea of 
TTL is to transfer knowledge between distant domains by using some auxiliary 
intermediate data as a bridge. The distant domains can have either 
heterogeneous feature spaces, or homogeneous feature spaces but distant 
characteristics, and they can be connected by one or multiple intermediate 
domains. We also propose several learning algorithms under the TTL framework, 
including instance-based, feature-based and model-based algorithms, to tackle 
the DDTL problem under different problem settings, and we verify the proposed 
algorithms on real-world data sets.


Date:			Monday, 27 February 2017

Time:			11:00am - 1:00pm

Venue:			Room 4475
 			Lifts 25/26

Chairman:		Prof. Yaping Gong (MGMT)

Committee Members:	Prof. Qiang Yang (Supervisor)
 			Prof. Yangqiu Song
 			Prof. Nevin Zhang
 			Prof. Pascale Fung (ECE)
 			Prof. Wai Lam (Sys Engg & Engg Mgmt, CUHK)


**** ALL are Welcome ****