Diploma Thesis Projects 2025-2026
-
JVM: Static Analysis
In this thesis, the student will implement a static analysis for Java.
The analysis will extend an existing pointer analysis to try and infer
object lifetimes, using a type-and-effect system. You will learn about
type systems, type-and-effect systems, and static analysis.
-
JVM: Dynamic Analysis
Implement a dynamic analysis in the OpenJDK JVM. The analysis will
extend the C1 and C2 JIT compilers inside the JVM to produce instrumented
code that tracks read and write accesses, producing a trace. Another tool
can then read and analyze the trace to produce statistics about memory
accesses and object lifetimes.
-
JVM: Add support for secret bytecode
The goal of this project is to change the JVM JIT compiler to
keep the bytecode secret from external attackers for as long as
possible during the program execution. Work involves
understanding and modifying the deoptimization path and stack
rewriting part of OpenJDK.
-
JVM: OpenJDK memory performance profiling
Some JVM garbage collectors occasionally pause program execution
to perform GC tasks atomically. The goal of this project is to
augment the JVM with performance counter tracing (using the Perf
API) and produce detailed traces of hardware counters for memory
events (cache misses, hits, etc.). The traces should be able to
separately profile the application memory performance and the GC
memory performance.
-
JVM: Auto-hint generation in TeraHeap
TeraHeap is an extension of the OpenJDK garbage collectors that grows
the Java Heap over NVMe drives. In this thesis, you will replicate the
evaluation from the paper "Blaze: Holistic Caching for Iterative Data
Processing" and apply their algorithm for automatic cache selection on
TeraHeap.
-
JVM: Learn the ElasticSearch code and port to TeraHeap
TeraHeap is an extension of OpenJDK that can place a very large
(terabyte) Java Heap on SSD and NVMe devices.
The goal of this project is to port the ElasticSearch engine to
TeraHeap, so that it is able to store objects in the heap and
avoid serialization onto disk files.
-
JVM: Learn the Flink code and port to TeraHeap
TeraHeap is an extension of OpenJDK that can place a very large
(terabyte) Java Heap on SSD and NVMe devices.
The goal of this project is to port the Flink Data Analytics framework to
TeraHeap, so that it is able to store objects in the heap and avoid
serialization onto disk files.
-
DevOps: Build Continuous Integration and Testing
The goal of this project is to develop continuous integration for
code repositories for various versions of OpenJDK and TeraHeap, using a
set of existing container images. Work will include test development,
automated testing, automated building, testing of multiple
configurations, and data analysis of results.
-
Supercomputing: Understanding distributed AI applications
Context: Nowadays, Large Language Models such as LLAMA or ChatGPT are so
huge that they require thousands of machines to perform learning and
inference. In both cases, they exchange very large amounts of data, which
means effective communication between them is very important. Recently,
Chakra ET traces have been proposed to model distributed AI applications
and develop better communication schemes both at network and application
level.
Description: The student will become familiar with the open-source Chakra
ET traces and tools. They will then capture the traces of a small AI
application and analyze them. Finally, they will develop a parser that
feeds those traces into a network simulator that we have created in the
lab.
Technical skills: C++ and Python; computer networks
-
Data Analysis: Assess the MPI Simulator Accuracy
The goal of this work is to analyze trace and simulation data,
compare with actual supercomputer runs and other simulated
results, and estimate the accuracy of the simulator model.
It will require some C++, but most of the work will be analyzing
data (this includes doing some statistics) using Python and
Jupyter notebooks, and investigating where we see deviations.
-
Compilers: Sign Extension Optimizations
Currently, LLVM applies ad-hoc techniques for sign extension elimination.
We can eliminate a sign extension if the upper bits are redundant for the
rest of the program. The goal of this work is to apply the globally
optimal algorithm by Kawahito et al. for sign extension elimination to
the LLVM Intermediate Representation format.
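As a minimal illustration of what "redundant upper bits" means, the Python sketch below models a 32-to-64-bit sign extension feeding an addition whose result is only ever used as a 32-bit value. The helper names (`sext32`, `trunc32`) are hypothetical, and this is not the algorithm by Kawahito et al.; it only shows one case where removing the extension cannot change the observable result.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def sext32(x):
    """Sign-extend the low 32 bits of x to a 64-bit register value."""
    x &= MASK32
    return (x | ~MASK32) & MASK64 if x & 0x80000000 else x

def trunc32(x):
    """Keep only the low 32 bits, as a 32-bit use of the register would."""
    return x & MASK32

def add_with_sext(a, b):
    # 64-bit add on sign-extended operands; only the low 32 bits are used.
    return trunc32((sext32(a) + sext32(b)) & MASK64)

def add_without_sext(a, b):
    # Same add with both extensions removed; upper register bits are arbitrary.
    return trunc32((a + b) & MASK64)

# The two versions agree even when the upper 32 bits hold garbage.
for a, b in [(0xFFFFFFFF, 2), (0x80000000, 0x80000000), (0xDEAD00000005, 7)]:
    assert add_with_sext(a, b) == add_without_sext(a, b)
```

The extension would not be removable if a later instruction observed the upper bits, for example a 64-bit comparison or an address computation.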
-
Compilers: Vector Optimizations
Currently, LLVM uses standard loop transformation techniques to detect
opportunities to use SIMD vector operations. Some RISC-V processors
offer very large vector registers, designed to optimize AI computations.
The goal of this project is to integrate polyhedral loop transformations
with the LLVM vector optimizations and optimize AI kernels.
-
Compilers: Dynamic Analysis in Go
Implement a dynamic analysis in the Go Runtime System. The analysis
will extend the Runtime to
produce instrumented code that tracks read and write accesses,
producing a trace. Another tool can then read and analyze the
trace to produce statistics about memory access, or detect data
races or deadlocks in parallel code.
-
Supercomputing: Algorithms for fast photon distribution fitting
Context: There is a variety of situations where scientists need to
analyse noisy images in real time, for instance to track orbiting
satellites to support laser communication, or to control experiments that
challenge the rules of quantum physics. In both cases, they need to fit
images to an expected photon distribution with good accuracy and as
efficiently as possible.
Description: The student will survey algorithms for distribution fitting
and estimate their complexity. Then, they will either adapt one of those
algorithms or propose a new solution that best meets the requirements of
our experiments, striving to leverage the SIMD instructions present on
modern CPUs.
Technical skills: Python, C; Algorithm complexity
Diploma Thesis Projects 2024-2025
-
JVM: Static Analysis
Implement a static analysis for Java. The analysis will use an existing
framework to implement pointer and dataflow analysis and detect
thread-local references in bytecode.
-
JVM: Multi-tenant JVM orchestrator
Design and implement an orchestrator service to manage resources
(e.g., DRAM) among multiple JVM processes or containers running
on a single machine. The goal of this project is to integrate
existing support for dynamic JVM Heap resizing in TeraHeap, to
give DRAM to the JVM processes that require more heap at any
given point in time.
-
MPI Simulator: Efficient Trace Generation
Supercomputing applications are often simulated before running
on actual supercomputers with millions of CPUs. The goal of
this work is to augment an existing state-of-the-art MPI
simulator with support for generating traces fast, which can
then be simulated on different supercomputer configurations.
Skills involved include C++, MPI, some statistics, possibly Python.
-
MPI Simulator: Topology-aware Job Mapping
In practice, the resources of a supercomputer are managed by a
central system that allocates slices of it to different
applications. It is best that different applications do not
interfere with each other on the CPU or network links.
To do this, the algorithm that allocates resources for each job
must know how the compute nodes are connected with each other
(topology), and select slices that are convex (i.e., messages do
not use network resources outside of that slice).
This project will start with the fat-tree topology (where an
approximate solution should be fairly straightforward), and
then move on to more demanding topologies (torus and dragonfly).
It involves work on the algorithmic aspects and a C++
implementation.
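The convexity requirement can be sketched as a small graph check. The Python example below assumes an undirected topology given as an adjacency dictionary and assumes that messages follow minimal-hop routes; both are assumptions of this sketch, and the function names and the 6-node ring are hypothetical. A slice is accepted only if no shortest path between two of its nodes passes through a node outside it.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def is_convex(adj, slice_nodes):
    """True if no shortest path between two slice nodes leaves the slice."""
    slice_nodes = set(slice_nodes)
    dist = {n: bfs_distances(adj, n) for n in slice_nodes}
    for u, v in combinations(slice_nodes, 2):
        if v not in dist[u]:
            continue  # disconnected pair: no messages can flow between them
        for w in adj:
            # w lies on some shortest u-v path iff the distances add up.
            on_shortest_path = (
                w in dist[u] and w in dist[v]
                and dist[u][w] + dist[v][w] == dist[u][v]
            )
            if on_shortest_path and w not in slice_nodes:
                return False
    return True

# Hypothetical 6-node ring: 0-1-2-3-4-5-0
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(is_convex(ring, [0, 1, 2]))     # True
print(is_convex(ring, [0, 2]))        # False: node 1 is on the route
print(is_convex(ring, [0, 1, 2, 3]))  # False: 0 and 3 also connect via 5 and 4
```

An allocator for a real topology would also need to account for the actual routing policy, which may differ from minimal-hop routing.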
-
MPI Simulator: Algorithm and modeling of collective communication primitives
Context: With the rise of AI, the performance requirements of datacentres
are increasing quickly, and we see a convergence with High Performance
Computers (HPC), also known as supercomputers. In particular, the
technologies used by machines to communicate are now very similar in both
HPC and AI. In HPC, the Message Passing Interface (MPI) runtime has been
the de facto standard for distributed applications, and it recently
inspired most of the distributed runtimes in the AI space as well. In
particular, all those applications rely heavily on so-called collective
communication primitives (e.g., all-to-all exchange of messages or data
reduction), where a group of processes distributed over multiple computers
exchange data. But those collective primitives themselves introduce a lot
of complexity.
Description: The student will first become familiar with the MPI runtime,
and in particular its collective primitives. They will then survey the
different algorithms used in state-of-the-art MPI implementations,
striving to present them in a unified description framework. Finally, they
will integrate a subset of those algorithms into a simulator developed in
our team.
Technical skills: C and/or C++ (the student will need to understand production source code); fluency with algorithms.
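To give a flavour of what a collective algorithm looks like, here is a minimal Python sketch that simulates a sum all-reduce using recursive doubling, one well-known algorithm for this primitive. It is an illustration only: the function name is hypothetical, the ranks are simulated as list entries rather than real MPI processes, and nothing here says which algorithms the project or the simulator will actually cover.

```python
def allreduce_recursive_doubling(values):
    """Simulate a sum all-reduce over len(values) ranks (a power of two).

    In round k, rank r exchanges its partial sum with rank r XOR 2**k,
    so every rank holds the global sum after log2(p) rounds.
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("this sketch needs a power-of-two number of ranks")
    partial = list(values)
    rounds = 0
    distance = 1
    while distance < p:
        # All exchanges of a round happen simultaneously, so read from a copy.
        previous = list(partial)
        for rank in range(p):
            partner = rank ^ distance
            partial[rank] = previous[rank] + previous[partner]
        distance *= 2
        rounds += 1
    return partial, rounds

result, rounds = allreduce_recursive_doubling([1, 2, 3, 4])
print(result, rounds)  # [10, 10, 10, 10] 2
```

Each rank sends and receives one message per round, so the number of rounds grows logarithmically with the number of ranks; other algorithms trade this latency behaviour against bandwidth differently.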
-
Supercomputing: Characterization of state-of-the-art High Performance interconnects
Context: With the rise of AI, the performance requirements of datacentres
are increasing quickly, and we see a convergence with High Performance
Computers (HPC), also known as supercomputers. In particular, the
technologies used by machines to communicate are now very similar in both
HPC and AI. In order to reach the required performance levels, those
interconnects and their middleware are quite complex, and achieving the
best performance for both HPC and AI applications requires a good
knowledge of those elements.
Description: The student will first become familiar with the software
ecosystem used in HPC clusters (batch systems, MPI runtime). Then, we
will benchmark the interconnect technology available in various HPC
centers, using a set of both state-of-the-art and in-house tools, to
identify both intrinsic characteristics (e.g., latency, throughput) and
performance for typical communication patterns. Eventually, those
insights will be used to calibrate a network model developed in our team.
Technical skills: C or C++; Python; Basic data analysis (average,
standard deviation, linear regressions); Basic data visualisation
(scatter plots, histograms).
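As a small example of the kind of data analysis involved, the Python sketch below estimates latency and bandwidth from message-size and transfer-time measurements with an ordinary least-squares fit. The linear cost model (time = latency + size / bandwidth), the function name, and the synthetic measurements are all assumptions of this sketch, not the lab's network model or real benchmark data.

```python
def fit_latency_bandwidth(sizes_bytes, times_s):
    """Least-squares fit of time = latency + size / bandwidth.

    Returns (latency in seconds, bandwidth in bytes per second).
    """
    n = len(sizes_bytes)
    mean_x = sum(sizes_bytes) / n
    mean_y = sum(times_s) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_bytes)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sizes_bytes, times_s))
    slope = sxy / sxx                    # seconds per byte
    intercept = mean_y - slope * mean_x  # time of a zero-byte message
    return intercept, 1.0 / slope

# Synthetic measurements generated from a 2 microsecond latency and 1 GB/s.
sizes = [1_000, 10_000, 100_000, 1_000_000]
times = [2e-6 + s / 1e9 for s in sizes]
latency, bandwidth = fit_latency_bandwidth(sizes, times)
```

On real measurements the fit would be applied to many repetitions per message size, and the residuals would show where the linear model stops holding.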
-
Supercomputing: Web-based dynamic data visualization
Context: High Performance networks are very complex beasts, and
understanding them in detail is hard, especially when many
communications happen at the same time. Usually, administrators rely on a
set of diagnostic tools to detect errors and saturation issues. However,
there are cases where they could use more interactive data visualization,
for instance when they set up a new interconnect or when they try to
identify soft issues (e.g., congestion or performance degradation). In our
lab, we have a tool that simulates these networks, and we are able to
generate snapshots of their state, which we can then analyse
interactively in a web browser using JavaScript-based pages, but we lack
the ability to view the evolution of the network over time.
Description: The student will become familiar with common JavaScript
tools and libraries (e.g., Node.js and D3.js) and the React framework
(e.g., with Next.js). Then, they will implement a web application that
streams simulated data from the server and presents it dynamically to the
client. In time, they will add more interactive features for the user,
such as: (1) selection of a subset of simulated nodes; (2) selection of a
subset of simulated communication flows; (3) various view layers
(contention level, Bit Error Rate); (4) grouping of simulated nodes by
simulated application and/or rack.
Technical skills: Web technologies (JavaScript, React, HTML, DOM, CSS)
-
Supercomputing: Dynamic 3D data visualization
Context: High Performance networks are very complex beasts, and
understanding them in detail is hard, especially when many
communications happen at the same time. One way to tackle this issue is
to leverage 3D graphics to get a qualitative understanding of the system.
We have developed a graphical interface that uses Qt and the VTK library
to display a near-real-time view of a network (currently only the 2D-Mesh
topology).
Description: The student will become familiar with VTK and Qt through
simple examples. Then they will improve the existing visualization tool.
Depending on time and skill set, they may work on (1) adding support for
additional simulated network topologies; (2) adding a representation of
link usage and node/link failures; (3) a simplified computer rack
representation; (4) a WebGL or WebGPU back-end for rendering.
Technical skills: C++; 3D graphics fundamentals; Qt basics
Diploma Thesis Projects 2023-2024
-
JVM: Dynamic Race Detection
Implement a dynamic analysis in the OpenJDK JVM. The analysis
will extend the C1 and C2 JIT compilers inside the JVM to
produce instrumented code that tracks read and write accesses,
producing a trace. Another tool can then read and analyze the
trace to detect data races or deadlocks in parallel code.
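As a rough sketch of what the trace-analysis tool could do, the Python example below applies a simplified lockset check in the style of Eraser to a recorded trace. The trace format (thread, operation, address, set of held locks) and the function name are hypothetical, and the project description does not prescribe this particular technique. An address is reported when it is accessed by more than one thread, written at least once, and no single lock is held across all of its accesses.

```python
from collections import defaultdict

def find_potential_races(trace):
    """Return addresses whose accesses share no common lock.

    trace: iterable of (thread, op, address, locks_held) tuples, where op
    is "r" or "w" and locks_held is a set of lock names (assumed format).
    """
    common_locks = {}            # address -> locks held on every access
    threads = defaultdict(set)   # address -> threads that touched it
    written = set()              # addresses with at least one write
    for thread, op, address, locks_held in trace:
        threads[address].add(thread)
        if op == "w":
            written.add(address)
        if address in common_locks:
            common_locks[address] &= set(locks_held)
        else:
            common_locks[address] = set(locks_held)
    return sorted(
        address
        for address, locks in common_locks.items()
        if not locks and len(threads[address]) > 1 and address in written
    )

trace = [
    ("T1", "w", "x", {"L"}), ("T2", "r", "x", {"L"}),   # protected by L
    ("T1", "w", "y", set()), ("T2", "w", "y", {"L"}),   # no common lock
    ("T1", "r", "z", set()), ("T2", "r", "z", set()),   # read-only
]
print(find_potential_races(trace))  # ['y']
```

This check ignores happens-before ordering (thread start and join, for instance), so it can report races that cannot occur in practice.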
-
JVM: Port the Lucene search engine to TeraCache
TeraCache is an extension of OpenJDK that can place a very large
Java Heap on SSD and NVMe devices.
The goal of this project is to port the Lucene search engine to
TeraCache, so that it is able to store objects in the heap and
avoid serialization onto disk files.
-
Graph Analytics: Misleading Repurposing Detection
Misleading repurposing occurs when a malicious user changes the
identity of their social media account for a new purpose, while
retaining their followers.
The goal of this project is to replicate the work in the
corresponding paper by Elmas and Overdorf (in ICWSM-2023) and
apply it to Greek-speaking Twitter traffic.
-
Graph Analytics: Community Detection
Compare a set of community detection algorithms using Twitter
data. Correlate with real-world outcomes (election results,
market surveys, official statistics).
Diploma Thesis Projects 2022-2023
-
Compilers: LLVM T-Head Ba/Bb/Bs Support
T-Head has a RISC-V SoC with vendor-specific extensions (see
the spec).
The goal of this task is to add support for the xtheadba,
xtheadbb and xtheadbs extensions to LLVM and evaluate the
speedup. Note that these extensions are similar to the RISC-V
zb* (zba, zbb, zbc, zbs) extensions, which are already supported
in LLVM. Besides implementing the instructions and extending the
cost model, a proper integration is required in order to let the
compiler optimize provided code and emit these instructions.
Adding new test cases to LLVM's test infrastructure ensures
future changes won't break the functionality. The evaluation of
the speedup can be achieved by comparing the dynamic instruction
count of SPEC CPU 2017 benchmarks using QEMU.
-
Compilers: LLVM T-Head CondMov, Mac Support
T-Head has a RISC-V SoC with vendor-specific extensions (see
the spec).
The goal of this task is to add support for the xtheadcondmov
and xtheadmac extensions to LLVM and evaluate the speedup. Note
that these extensions are similar to existing instructions in
other architectures (MIPS has similar conditional move
instructions, AArch64 has similar multiply-accumulate
instructions), which are already supported in LLVM. Besides
implementing the instructions and extending the cost model, a
proper integration is required in order to let the compiler
optimize provided code and emit these instructions. Adding new
test cases to LLVM's test infrastructure ensures future changes
won't break the functionality. The evaluation of the speedup can
be achieved by comparing the dynamic instruction count of SPEC
CPU 2017 benchmarks.
-
Compilers: LLVM MemPair, MemIdx, FMemIdx Support
T-Head has a RISC-V SoC with vendor-specific extensions (see
the spec).
The goal of this task is to add support for the xtheadmempair,
xtheadmemidx and xtheadfmemidx extensions to LLVM and evaluate
the speedup. Note that these extensions are similar to existing
instructions in AArch64, which are already supported in LLVM.
Besides implementing the instructions and extending the cost
model, a proper integration is required in order to let the
compiler optimize provided code and emit these instructions.
Adding new test cases to LLVM's test infrastructure ensures
future changes won't break the functionality. The evaluation of
the speedup can be achieved by comparing the dynamic instruction
count of SPEC CPU 2017 benchmarks.
-
JVM: Add support for secret bytecode
The JVM uses a class loader to parse class files and load
bytecode to be executed. The goal of this project is to change
the class loader to support encrypted class files, with the aim of
keeping the bytecode secret from external attackers. Work includes
defining the threat model and possibly also working with the JIT
compiler to minimize the attack window, i.e., the time that
unencrypted bytecode stays in memory and can be stolen via, e.g.,
a core dump.
-
JVM: Java Garbage Collection
Extend OpenJDK with support for annotations that control memory
placement and lifetime of objects, and adapt the garbage
collector to migrate objects accordingly.
-
Programming Languages: Contextual effects in Rust
Extend the type system of the Rust compiler to generate
contextual effect constraints.
-
Programming Languages: Locality in the Rust Memory Allocator
Measure locality and object lifetimes in Rust programs.
-
Social Network Analytics: Link Prediction
Train a Machine Learning or Deep Learning model to predict user
interactions on social media.
Diploma Thesis Projects 2021-2022
-
Social Network Graph Analytics: Find blocking and ghosting
Description:
Design and implement an analysis to infer blocking between social
network users, using a large dataset from Twitter. Investigate
how blocks partition the graph, and discover how information
diffusion is limited by users blocking or ghosting other users.
Further information: Implement an information diffusion analysis
on large multilayer graphs, using the Spark-GraphX distributed
analytics runtime system. Use special features of the Twitter
API to infer when users have blocked other users. Based on the
inferred block relations, study two existing alternative
algorithms for information diffusion in social networks, and
discover how the existence of a single block edge limits
information diffusion in the full graph on average.
Related topics: Graph theory, information diffusion
(algorithms used in epidemiology), graph analytics, multilayer
graphs, statistics.
-
Social Network Graph Analytics: Predict account bans
Try to predict account bans on the YouTube or Twitter social
network graphs. This project will use graph embedding
algorithms to represent evolving graphs and perform machine
learning on the embeddings, to predict which nodes in the graph
may be deleted in the near future.
-
Programming Languages: Static Analysis in Rust
Extend the type system of the Rust compiler to generate and
solve constraints to compute a property.
-
Programming Languages: Static Race Detection Warning Ranking
Port, maintain, and evaluate the Locksmith static analyzer on a
set of C programs.
Data race detection produces a large list of warnings of
possible races. This project will try to rank these warnings by
order of importance, trying and comparing two ranking
approaches.
Diploma Thesis Projects 2020-2021
-
Learn Rust, Parse Rust, Analyze Rust
Familiarize yourself with existing open-source Rust front-ends,
compare and benchmark them and develop a simple static checker
for Rust programs in one of the existing front-ends.
-
LLVM and OpenMP tasks
Extend LLVM with OpenMP tasks, and link it with the PARTEE
task-parallel runtime system. LLVM is a production compiler
written in C++. PARTEE is an API and runtime system written in
low-level C, targeting distributed DMA-based and shared-memory
architectures.
-
Java Garbage Collection
Extend one of the OpenJDK JVM Garbage Collectors with support
for placing objects on memory pages in a programmable way, in
order to achieve better temporal and spatial locality. Then
benchmark your GC with different kinds of applications.
-
Artificial Intelligence on Social Network Graphs
Extend an existing algorithm for graph embeddings with
structural features for shapes other than triangles. Explore
applications where such features make an improvement in existing
Machine Learning or Deep Learning models, and measure the
improvement.
-
Social Network Graph Analytics: Community Detection
Description:
Analyze Twitter traffic on a given topic such as Type 2
Diabetes, and map the involved community. Use community
detection analysis to identify users and groups that repeatedly
interact with the topic. Calculate usage patterns and their
correlation with demographic and geolocation data. Implement your
analysis in a scalable analytics framework (Spark or Flink) and
measure its performance and scalability.
-
Social Network Graph Analytics: Information Diffusion
Description:
Design and implement a plugin for a Twitter crawler that will
discover and report in real time the diffusion graphs produced
for specific keywords, memes, or phrases.
Further information: Information diffusion is the analysis of
the propagation of information in networks. Retweets are an
example of information diffusion, where an original piece of
information may be propagated to users that were not immediately
exposed to the original content in the graph, but were eventually
able to see the content because of a path of retweets from the
original poster to the user.
The aim of this project is to develop a streaming analysis on
streams of data produced by a Twitter crawler and subsequently
produce a stream of evolving diffusion graphs for selected
memes, hashtags, phrases, URLs, topics, etc.
Related topics: Graph theory, information diffusion
(algorithms used in epidemiology), graph analytics, multilayer
graphs, statistics.
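The retweet-path idea described above can be sketched in a few lines of Python. The example assumes a hypothetical record format of (retweeter, retweeted user) pairs for a single piece of content and works on a finished batch rather than a live stream, so it only illustrates how a diffusion graph is derived, not the streaming analysis the project asks for.

```python
from collections import defaultdict, deque

def diffusion_graph(retweets, original_poster):
    """Return the users reached from original_poster and their hop depth.

    retweets: iterable of (retweeter, retweeted_user) pairs for one item
    (an assumed format, not the crawler's actual output).
    """
    spread_to = defaultdict(list)
    for retweeter, source in retweets:
        spread_to[source].append(retweeter)

    # Breadth-first search along retweet edges, starting at the poster.
    depth = {original_poster: 0}
    queue = deque([original_poster])
    while queue:
        user = queue.popleft()
        for reached in spread_to[user]:
            if reached not in depth:
                depth[reached] = depth[user] + 1
                queue.append(reached)
    return depth

retweets = [("bob", "alice"), ("carol", "bob"), ("dave", "carol"),
            ("erin", "zed")]  # erin retweeted an unrelated user
print(diffusion_graph(retweets, "alice"))
# {'alice': 0, 'bob': 1, 'carol': 2, 'dave': 3}
```

Here dave never interacted with alice directly but is reached through a path of two intermediate retweets, which is the situation the description refers to.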
-
Static Analysis in Infer
Description:
Extend the Infer static analysis system with contextual effects.
Related information: Infer is an open-source tool mainly developed
by Facebook that can analyze programs written in C, C++,
Objective-C, and Java. It models heaps using propositions in
separation logic.
Related reading:
  - Find the Infer repository on GitHub, read its documentation,
    install and compile it, and use it to analyze example
    programs.
  - Read up on separation logic, starting from the Wikipedia page and
    the seminal paper by Reynolds.
Diploma Thesis Projects 2019-2020
-
OpenMP distributed memory allocation
Description:
Extend OpenMP with directives for memory allocation patterns
tailored to NUMA or distributed memories.
Related reading: Region-based memory management, the Legion
parallel programming language.
-
Fault-tolerant PARTEE tasks
Description:
Extend the PARTEE runtime system with support for local and
global checkpointing and recovery from both transient and
permanent errors. Measure the overhead of fault tolerance on
task-parallel programs. This project will use experimental
hardware and requires physical presence at FORTH.
-
Social Network Graph Analytics: Community Detection
Description:
Design and implement an analysis that discovers users speaking
about a specific topic in social media, and mines the
corresponding graph. Analyze the detected community with
respect to placement in the general audience.
-
Streaming Graph Analytics in Flink
Description:
Augment the Graph-Streaming library of Flink with additional
streaming algorithm implementations.
-
Graph Analytics in Flink
Description:
Reimplement an existing graph analytics computation in the
Flink distributed analytics runtime system.
Further information: Read the TwitterMancer paper and implement
feature extraction by modifying the Triangle Count algorithm
accordingly.
Diploma Thesis Projects 2018-2019
-
MPI benchmarking
Description:
Write tests and test automation, port existing benchmarks, and
evaluate MPI applications on experimental hardware platforms
with DMA-accelerated MPI.
-
MPI collectives
Description:
Understand the existing implementation of MPI collectives and
investigate alternative implementations better tailored for
DMA-accelerated communication and very low-latency custom networks in
experimental HPC hardware.
-
Linux Kernel
Description:
Add support for kernel self-extraction during boot for the
RISC-V architecture.
-
Linux Kernel
Description:
Extend the perf tool (implementing all required in-kernel
support) with additional counters for experimental HPC
architectures.
-
OpenMP tasks in Eclipse
Description:
Learn the Eclipse C internal representation, learn the workings of
an existing static analysis engine and interface the two.
-
Twitter Flu Trends
Calculate the phases of flu epidemics using Twitter data.
Diploma Thesis Projects 2017-2018
-
Benchmarking Kernel Memory Allocators
Compare kernel allocation across an off-the-shelf platform
based on x86 machines and an experimental ARM-based blade.
Develop benchmarks and draw conclusions.
-
Twitter Graph Analytics for Targeted Marketing
Parallelize and optimize existing algorithms for distributed
analytics. Develop a marketing application for the Twitter
graph for the Spark and Spark/GraphX analytics engines.
-
Graph Analytics for Flink Gelly
Benchmark Flink Gelly with large social network graphs.
Replicate existing analytics pipelines for Twitter in Flink.
Diploma Thesis Projects 2016-2017
-
Alternative RDD implementations for Spark
Description:
Learn to code in Scala and Spark and write one or more extensions
of Spark RDDs optimized for specific algorithms.
Diploma Thesis Projects 2013-2014
-
LLVM and tasks
Description:
Extend LLVM with a "spawn" keyword that calls a function in
parallel, as in Cilk, and link it with the PARTEE task-parallel
runtime system.
Diploma Thesis Projects 2012-2013
-
Task-parallel Fault Tolerance
Extend the BDDT runtime system with fault tolerance. Define a
realistic fault model for permanent and transient faults on existing
multicore computers. Extend the BDDT runtime system with support for
local and global checkpointing and recovery from both transient and
permanent errors. Measure the overhead of fault tolerance on
task-parallel programs.
-
Runtime Dependencies in Recursively Parallel Programs
Implement a runtime analysis for dependencies among
recursively-parallel tasks, and extend an existing runtime system
(e.g., Cilk) with a dependency-aware scheduler.
-
Static analysis in Eclipse
Learn the architecture of the Eclipse IDE (for either Java or C
programming), including the AST and analysis frameworks, and write an
Eclipse interface for an existing static analysis engine.
Internship Topics 2012
-
A fault-tolerant task parallel runtime
Positions: 1
Lab: CARV
Description: Understand the BDDT runtime and add support
for checkpointing of computations, and restoring to an earlier
point on fault.