# Open Topics

We offer multiple Bachelor's and Master's theses, guided research projects, and IDPs in the area of data mining and machine learning. A *non-exhaustive* list of open topics is given below.

If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.

### Temporal Point Processes

**Type:** Master's thesis / guided research

**Prerequisites:**

- Strong machine learning knowledge
- Proficiency with deep learning frameworks (TensorFlow or PyTorch)
- (Preferred) Experience with deep sequence models (e.g. RNNs, 1-d convolutional networks, transformers)
- (Preferred) Experience with normalizing flows

**Description:**

Hospital visits, purchases in e-commerce systems, financial transactions, posts on social media: many forms of human activity can be represented as discrete events occurring at irregular intervals. Temporal point processes (TPPs) provide a natural framework for modeling such data. Unfortunately, most existing TPP approaches lack either the flexibility or the scalability needed to handle complex, high-dimensional real-world datasets. In this thesis, you will use deep learning techniques to design new TPP models that are both efficient and flexible. You will use datasets from different domains (such as communication, e-commerce, and server logs) to evaluate the new approaches on various prediction tasks.
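To make the notion of "events at irregular intervals" concrete, the following minimal sketch simulates the simplest possible TPP, a homogeneous Poisson process with a constant rate, and evaluates its log-likelihood. This is only an illustration of the basic framework, not one of the models to be developed in the thesis; the function names and the chosen rate are assumptions made for this example.

```python
import math
import random


def sample_poisson_process(rate, t_end, rng):
    """Draw event times on [0, t_end] using i.i.d. exponential inter-event times."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(rate)
        if t > t_end:
            return times
        times.append(t)


def log_likelihood(times, rate, t_end):
    """Log-likelihood of a constant-rate process: n * log(rate) - rate * t_end."""
    return len(times) * math.log(rate) - rate * t_end


rng = random.Random(0)
t_end = 10.0
events = sample_poisson_process(rate=2.0, t_end=t_end, rng=rng)

# For a constant rate, the maximum likelihood estimate is (number of events) / t_end.
rate_mle = len(events) / t_end
print(len(events), rate_mle, log_likelihood(events, rate_mle, t_end))
```

Flexible TPP models replace the constant rate with a function of time and of the event history, for instance one parameterized by a neural network.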

**Contact:** Oleksandr Shchur

**References:**

### Graph Neural Networks

**Type:** Master's thesis / Bachelor's thesis / guided research

**Prerequisites:**

- Strong machine learning knowledge
- Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
- Knowledge of graph neural networks (e.g. GCN, MPNN)
- Knowledge of graph/network theory

**Description:**

Graph neural networks (GNNs) have recently achieved great success in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, and computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representations using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear, and there are many possible extensions that improve GNNs by going beyond the nodes' direct neighbors and simple message aggregation.
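The message passing idea described above can be sketched in a few lines. The example below is a deliberately simplified illustration: it averages the features of each node and its direct neighbors, and it omits the learned weight matrices and non-linearities that actual GNN layers such as GCN use. The function name and the toy graph are assumptions made for this example.

```python
def message_passing_step(features, neighbors):
    """One round of mean aggregation over each node and its direct neighbors."""
    updated = {}
    for node, own in features.items():
        # Each neighbor sends its current representation; the node also keeps its own.
        messages = [features[n] for n in neighbors.get(node, [])] + [own]
        updated[node] = [
            sum(m[i] for m in messages) / len(messages) for i in range(len(own))
        ]
    return updated


# Toy path graph: a - b - c
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}

hidden = message_passing_step(features, neighbors)
print(hidden)
```

After one step, node `a` only sees information from `b`. Applying the step a second time lets information from `c` reach `a`, which is how stacked layers extend the receptive field beyond direct neighbors.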

**Contact:** Johannes Klicpera

**References:**

- Semi-supervised classification with graph convolutional networks
- Relational inductive biases, deep learning, and graph networks
- Diffusion Improves Graph Learning
- Weisfeiler and Leman go neural: Higher-order graph neural networks

### Deep Learning for Molecules

**Type:** Master's thesis / guided research

**Prerequisites:**

- Strong machine learning knowledge
- Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
- Knowledge of graph neural networks (e.g. GCN, SchNet)
- Optional: Knowledge of machine learning on molecules and quantum chemistry

**Description:**

Deep learning models, especially graph neural networks (GNNs), have recently achieved great success in predicting quantum mechanical properties of molecules. These models have many applications, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, most of these models have only been proposed in recent years, and many open questions remain, such as the best way of representing the molecular structure, incorporating physical properties, exploring the chemical space of molecules, integrating long-range interactions, or predicting non-equilibrium molecule states and transitions.

**Contact:** Johannes Klicpera

**References:**

- Directional Message Passing for Molecular Graphs
- SchNet: A continuous-filter convolutional neural network for modeling quantum interactions
- Neural message passing for quantum chemistry
- Cormorant: Covariant Molecular Neural Networks

### Robustness Verification for Deep Classifiers

**Type:** Master's thesis / guided research

**Prerequisites:**

- Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
- Strong background in mathematical optimization (preferably in a machine learning setting)
- Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)
- (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data

**Description:**

Recent work shows that deep classifiers are vulnerable to adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour limits the deployment of promising neural-network-based classification methods in safety-critical scenarios. Therefore, new training methods are needed that promote (or preferably ensure) robust behaviour of the classifier around training samples.
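As a small illustration of what a robustness guarantee looks like, consider a linear binary classifier that predicts the sign of `w · x + b`. For this special case, the prediction provably cannot change under any perturbation whose L-infinity norm is smaller than `|w · x + b| / ||w||_1`. The sketch below computes this radius and constructs the worst-case perturbation. Deep networks do not admit such a simple closed form, which is what makes verification hard; the function names and numbers here are assumptions made for this example.

```python
def score(weights, bias, x):
    """Signed score of a linear classifier; the prediction is the sign of the score."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(value):
    return (value > 0) - (value < 0)


def linf_certified_radius(weights, bias, x):
    """Largest L-infinity radius within which the prediction cannot change.

    Assumes at least one weight is non-zero.
    """
    return abs(score(weights, bias, x)) / sum(abs(w) for w in weights)


def worst_case_perturbation(weights, bias, x, eps):
    """Move every coordinate by eps in the direction that shrinks the score most."""
    direction = sign(score(weights, bias, x))
    return [xi - eps * direction * sign(w) for xi, w in zip(x, weights)]


weights, bias = [2.0, -1.0], 0.5
x = [1.0, 1.0]

radius = linf_certified_radius(weights, bias, x)
inside = worst_case_perturbation(weights, bias, x, eps=0.4)
outside = worst_case_perturbation(weights, bias, x, eps=0.6)
print(radius, score(weights, bias, inside), score(weights, bias, outside))
```

The perturbation with `eps` below the radius leaves the sign of the score unchanged, while the one with `eps` above the radius flips it.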

**Contact:** Aleksei Kuvshinov

**References:**

- Certified Adversarial Robustness via Randomized Smoothing
- Formal guarantees on the robustness of a classifier against adversarial manipulation
- Towards deep learning models resistant to adversarial attacks
- Provable defenses against adversarial examples via the convex outer adversarial polytope
- Certified defenses against adversarial examples
- Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks
- Provable robustness of ReLU networks via maximization of linear regions

### Neural Density Estimation

**Type:** Master's thesis / guided research

**Prerequisites:**

- Strong machine learning knowledge
- Proficiency with Python and deep learning frameworks (e.g. PyTorch)

**Description:**

If you want to generate good-looking images, you should probably use Generative Adversarial Networks (GANs). But what if you want to evaluate the probability of a certain image under your model? This might be impossible with GANs. What we want is to model arbitrary distributions such that we can both sample from them and evaluate probabilities. Normalizing flows let us do both. They work by taking a sample from some simple distribution (e.g. a normal distribution) and transforming it with a non-linear function (think of neural networks), which gives us a sample from a more complex distribution. This is similar to GANs, with one key difference: the transformation is invertible. This means that we can train a model by maximizing likelihood and draw samples with ease. The only catch is that these models are still not as powerful, fast, or scalable as GANs. We aim to solve some of these issues in this project.
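The two operations described above, sampling and exact density evaluation, can be sketched with the simplest invertible transformation: a one-dimensional affine map applied to a standard normal sample. Real normalizing flows stack many non-linear invertible layers, so this is only a toy illustration; the class name `AffineFlow` and its parameters are assumptions made for this example.

```python
import math
import random


class AffineFlow:
    """Toy 1-D flow x = scale * z + shift with base distribution z ~ N(0, 1)."""

    def __init__(self, scale, shift):
        if scale == 0:
            raise ValueError("scale must be non-zero so that the map is invertible")
        self.scale = scale
        self.shift = shift

    def forward(self, z):
        return self.scale * z + self.shift

    def inverse(self, x):
        return (x - self.shift) / self.scale

    def sample(self, rng):
        # Sampling: draw from the simple base distribution, then transform.
        return self.forward(rng.gauss(0.0, 1.0))

    def log_prob(self, x):
        # Density evaluation: map x back to z and apply the change of variables,
        # log p_x(x) = log p_z(z) - log |dx/dz|.
        z = self.inverse(x)
        base_log_prob = -0.5 * (z * z + math.log(2.0 * math.pi))
        return base_log_prob - math.log(abs(self.scale))


flow = AffineFlow(scale=2.0, shift=1.0)
rng = random.Random(0)
samples = [flow.sample(rng) for _ in range(5)]

# The average log-likelihood is the quantity that maximum likelihood training increases.
avg_log_lik = sum(flow.log_prob(x) for x in samples) / len(samples)
print(samples, avg_log_lik)
```

Because the map is invertible, the same object supports both `sample` and `log_prob`, which is exactly what a GAN generator cannot offer.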

**Contact:** Marin Bilos

**References:**

### Uncertainty Estimation in Deep Learning

**Type:** Master's thesis / guided research

**Prerequisites:**

- Strong knowledge in machine learning
- Strong knowledge in probability
- Good programming skills

**Description:**

Safe prediction is a key feature in many intelligent systems. Classically, machine learning models compute output predictions regardless of the underlying uncertainty of the situations they encounter. In contrast, aleatoric and epistemic uncertainty estimates carry information about undecidable and uncommon situations, respectively. Taking uncertainty into account can substantially help to detect and explain unsafe predictions, and therefore make ML systems more robust. The goal of this project is to improve uncertainty estimation in ML models across various types of tasks.
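One common way to separate the two kinds of uncertainty, shown here purely as an illustration and not as the method pursued in this project, uses the predictions of several models (an ensemble). The entropy of the averaged prediction is treated as total uncertainty, the average entropy of the individual predictions as the aleatoric part, and their difference as the epistemic part. The function names and the toy probabilities are assumptions made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty(member_probs):
    """Split ensemble uncertainty into (total, aleatoric, epistemic)."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean_probs = [
        sum(p[c] for p in member_probs) / n_members for c in range(n_classes)
    ]
    total = entropy(mean_probs)
    aleatoric = sum(entropy(p) for p in member_probs) / n_members
    return total, aleatoric, total - aleatoric


# All members agree that the input is ambiguous: uncertainty is purely aleatoric.
ambiguous = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])

# Members are confident but disagree with each other: uncertainty is purely epistemic.
unfamiliar = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])

print(ambiguous, unfamiliar)
```

In both toy cases the total uncertainty is the same, but only the decomposition reveals whether the input is inherently undecidable or simply unlike what the models have seen.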

**Contact:** Bertrand Charpentier, Daniel Zuegner

**References:**

### Hierarchies in Deep Learning

**Type:** Master's thesis / guided research

**Prerequisites:**

- Strong machine learning knowledge
- Good programming skills

**Description:**

Multi-scale structures are ubiquitous in real-life datasets. As an example, phylogenetic nomenclature naturally reveals a hierarchical classification of species based on their evolutionary history. Learning multi-scale structures can help to reveal natural and meaningful organization in the data and to obtain compact data representations. The goal of this project is to leverage multi-scale structures to improve the speed, performance, and interpretability of deep learning models.

**Contact:** Bertrand Charpentier, Daniel Zuegner

**References:**

- Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
- Hierarchical Graph Representation Learning with Differentiable Pooling
- Gradient-based Hierarchical Clustering
- Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space