Diploma Thesis Projects 2025-2026

  1. JVM: Static Analysis
In this thesis, the student will implement a static analysis for Java. The analysis will extend an existing pointer analysis to infer object lifetimes, using a type-and-effect system. You will learn about type systems, type-and-effect systems, and static analysis.
  2. JVM: Dynamic Analysis
    Implement a dynamic analysis in the OpenJDK JVM. The analysis will extend the C1 and C2 JIT compilers inside the JVM to produce instrumented code that tracks read and write accesses, producing a trace. Another tool can then read and analyze the trace to produce statistics about memory accesses and object lifetimes.
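    To illustrate the offline half of the pipeline, here is a minimal sketch of the trace-reading tool, assuming a hypothetical line-based trace format `R|W <object-id> <timestamp>` (the real instrumented JIT would define its own, likely binary, format):

```python
from collections import defaultdict

def analyze_trace(lines):
    """Aggregate a hypothetical access trace: each line is '<R|W> <obj_id> <timestamp>'."""
    stats = defaultdict(lambda: {"reads": 0, "writes": 0, "first": None, "last": None})
    for line in lines:
        op, obj, ts = line.split()
        ts = int(ts)
        s = stats[obj]
        s["reads" if op == "R" else "writes"] += 1
        s["first"] = ts if s["first"] is None else min(s["first"], ts)
        s["last"] = ts if s["last"] is None else max(s["last"], ts)
    # object lifetime approximated as the span between first and last access
    return {obj: {**s, "lifetime": s["last"] - s["first"]} for obj, s in stats.items()}
```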
  3. JVM: Add support for secret bytecode
    The goal of this project is to change the JVM JIT compiler to keep the bytecode secret from external attackers for as long as possible during the program execution. Work involves understanding and modifying the deoptimization path and stack rewriting part of OpenJDK.
  4. JVM: OpenJDK memory performance profiling
    Some JVM garbage collectors occasionally pause program execution to perform GC tasks atomically. The goal of this project is to augment the JVM with performance counter tracing (using the Perf API) and produce detailed traces of hardware counters for memory events (cache misses, hits, etc.). The traces should be able to separately profile the application memory performance and the GC memory performance.
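    As a sketch of the post-processing side, here is how the machine-readable output of `perf stat -x,` could be aggregated; the CSV field layout assumed here (value, unit, event name, ...) follows the perf documentation but should be checked against the perf version in use:

```python
def parse_perf_csv(text):
    """Parse `perf stat -x,` output into {event: count}.
    Assumes the documented CSV layout: value,unit,event,...
    Comment/blank lines are ignored; '<not counted>' values are skipped."""
    counts = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(",")
        value, event = fields[0], fields[2]
        if value.startswith("<"):
            continue  # e.g. <not counted>, <not supported>
        counts[event] = int(value)
    return counts
```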
  5. JVM: Auto-hint generation in TeraHeap
    TeraHeap is an extension of the OpenJDK garbage collectors that grows the Java Heap over NVMe drives. In this thesis, you will replicate the evaluation from the paper Blaze: Holistic Caching for Iterative Data Processing and apply their algorithm for automatic cache selection on TeraHeap.
  6. JVM: Learn the ElasticSearch code and port to TeraHeap
    TeraHeap is an extension of OpenJDK that can place a very large (terabyte) Java Heap on SSD and NVMe devices. The goal of this project is to port the ElasticSearch engine to TeraHeap, so that it is able to store objects in the heap and avoid serialization onto disk files.
  7. JVM: Learn the Flink code and port to TeraHeap
    TeraHeap is an extension of OpenJDK that can place a very large (terabyte) Java Heap on SSD and NVMe devices. The goal of this project is to port the Flink Data Analytics framework to TeraHeap, so that it is able to store objects in the heap and avoid serialization onto disk files.
  8. DevOps: Build Continuous Integration and Testing
    The goal of this project is to develop continuous integration for code repositories for various versions of OpenJDK and TeraHeap, using a set of existing container images. Work will include test development, automated testing, automated building, testing of multiple configurations, and data analysis of results.
  9. Supercomputing: Understanding distributed AI applications
    Context: Nowadays, Large Language Models such as LLaMA or ChatGPT are so huge that they require thousands of machines for training and inference. In both cases, the machines exchange very large amounts of data, which makes effective communication between them very important. Recently, Chakra ET traces have been proposed to model distributed AI applications and develop better communication schemes at both the network and the application level.
    Description: The student will become familiar with the open-source Chakra ET traces and tools. Then, they will capture the traces of a small AI application and provide an analysis of those traces. Finally, they will develop a parser to feed those traces into a network simulator that we have created in the lab.
    Technical skills: C++ and Python; computer networks
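    To illustrate the kind of processing the parser will do, here is a toy dependency-order replay of a trace graph; the `node -> dependencies` dict used here is a simplified stand-in for the Chakra ET node structure, not its real protobuf schema:

```python
from collections import deque

def schedule_nodes(nodes):
    """Toy dependency-order replay of an execution-trace graph.
    `nodes` maps node id -> list of dependency ids (a simplified stand-in
    for Chakra ET compute/communication nodes and their dependencies)."""
    indeg = {n: len(deps) for n, deps in nodes.items()}
    users = {n: [] for n in nodes}
    for n, deps in nodes.items():
        for d in deps:
            users[d].append(n)
    ready = deque(sorted(n for n, d in indeg.items() if d == 0))
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)  # a simulator would issue the node's work here
        for u in users[n]:
            indeg[u] -= 1
            if indeg[u] == 0:
                ready.append(u)
    return order
```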
  10. Data Analysis: Assess the MPI Simulator Accuracy
    The goal of this work is to analyze trace and simulation data, compare with actual supercomputer runs and other simulated results, and estimate the accuracy of the simulator model. It will require some C++, but most of the work will be analyzing data (this includes doing some statistics) using Python and Jupyter notebooks, and investigating where we see deviations.
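    A minimal sketch of the kind of accuracy statistics involved, assuming paired lists of measured and simulated run times (function and field names are illustrative):

```python
import statistics

def accuracy_report(measured, simulated):
    """Per-run relative error of simulated vs. measured times, plus summary stats."""
    errors = [(s - m) / m for m, s in zip(measured, simulated)]
    return {
        "mean_rel_error": statistics.mean(errors),
        "stdev_rel_error": statistics.stdev(errors) if len(errors) > 1 else 0.0,
        "worst_abs_rel_error": max(abs(e) for e in errors),
    }
```

    Runs with a large deviation from the mean are the ones worth investigating further.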
  11. Compilers: Sign Extension Optimizations
    Currently LLVM applies ad-hoc techniques for sign extension elimination. A sign extension can be eliminated if the rest of the program never observes the upper (sign-copied) bits. The goal of this work is to apply the globally optimal sign-extension-elimination algorithm by Kawahito et al. to the LLVM Intermediate Representation.
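    The removability condition can be sketched as a demanded-bits check; `sext_is_redundant` and its arguments are illustrative names, not LLVM API:

```python
def sext_is_redundant(src_bits, demanded_mask):
    """A sext from src_bits to a wider type only materializes copies of the
    sign bit in positions >= src_bits.  If no demanded (live) bit of the
    result lies in those positions, the extension is unobservable and can
    be removed."""
    high_bits = ~((1 << src_bits) - 1)
    return (demanded_mask & high_bits) == 0
```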
  12. Compilers: Vector Optimizations
    Currently LLVM uses standard loop transformation techniques to detect opportunities to use SIMD vector operations. Some RISC-V processors offer very large vector registers, designed to optimize AI computations. The goal of this project is to integrate polyhedral loop transformations with the LLVM vector optimizations and optimize AI kernels.
  13. Compilers: Dynamic Analysis in Go
    Implement a dynamic analysis in the Go Runtime System. The analysis will extend the Runtime to produce instrumented code that tracks read and write accesses, producing a trace. Another tool can then read and analyze the trace to produce statistics about memory access, or detect data races or deadlocks in parallel code.
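    To illustrate what the trace-consuming tool could do, here is a small happens-before race detector in the djit+ style, over a hypothetical event format; this is a sketch of the technique, not the Go runtime's actual race detector:

```python
def detect_races(events):
    """Happens-before race detection over a trace, djit+-style.
    Events: ('r'|'w', thread, var) for accesses, ('acq'|'rel', thread, lock)
    for synchronization.  Vector clocks are dicts thread -> counter."""
    def leq(a, b):  # a happens-before b?
        return all(v <= b.get(t, 0) for t, v in a.items())
    def join(a, b):
        return {t: max(a.get(t, 0), b.get(t, 0)) for t in set(a) | set(b)}

    C, L, R, W, races = {}, {}, {}, {}, []
    for op, t, x in events:
        C.setdefault(t, {t: 1})
        if op == "acq":          # inherit the releaser's clock
            C[t] = join(C[t], L.get(x, {}))
        elif op == "rel":        # publish our clock, then tick
            L[x] = dict(C[t])
            C[t][t] = C[t][t] + 1
        elif op == "r":          # read races with unordered writes
            if not leq(W.get(x, {}), C[t]):
                races.append((t, x))
            R.setdefault(x, {})[t] = C[t][t]
        elif op == "w":          # write races with unordered reads or writes
            if not (leq(W.get(x, {}), C[t]) and leq(R.get(x, {}), C[t])):
                races.append((t, x))
            W.setdefault(x, {})[t] = C[t][t]
    return races
```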
  14. Supercomputing: Algorithms for fast photon distribution fitting
    Context: There is a variety of situations where scientists need to analyse noisy images in real time: for instance, to track orbiting satellites to support laser communication, or to control experiments that challenge the rules of quantum physics. In both cases, they need to fit images against an expected photon distribution, with good accuracy and as efficiently as possible.
    Description: The student will survey algorithms for distribution fitting and estimate their complexity. Then, they will either adapt one of those algorithms or propose a new solution that best meets the requirements of our experiments, striving to leverage the SIMD instructions present on modern CPUs.
    Technical skills: Python, C; Algorithm complexity
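    As one point in the design space the survey will cover, here is a method-of-moments fit of a 1-D photon profile; it trades the accuracy of iterative least squares for a single cheap, SIMD-friendly pass (function and profile are illustrative):

```python
import math

def fit_gaussian_moments(intensities):
    """Method-of-moments fit of a 1-D Gaussian to a photon profile:
    centroid = first moment, sigma = sqrt of second central moment.
    One pass over the data, no iteration."""
    total = sum(intensities)
    mean = sum(i * v for i, v in enumerate(intensities)) / total
    var = sum(v * (i - mean) ** 2 for i, v in enumerate(intensities)) / total
    return mean, math.sqrt(var)
```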

Diploma Thesis Projects 2024-2025

  1. JVM: Static Analysis
    Implement a static analysis for Java. The analysis will use an existing framework to implement pointer and dataflow analysis and detect thread-local references in bytecode.
  2. JVM: Multi-tenant JVM orchestrator
    Design and implement an orchestrator service to manage resources (e.g., DRAM) among multiple JVM processes or containers running on a single machine. The goal of this project is to integrate existing support for dynamic JVM Heap resizing in TeraHeap, to give DRAM to the JVM processes that require more heap at any given point in time.
  3. MPI Simulator: Efficient Trace Generation
    Supercomputing applications are often simulated before running on actual supercomputers with millions of CPUs. The goal of this work is to augment an existing state-of-the-art MPI simulator with support for generating traces fast, which can then be simulated on different supercomputer configurations. Skills involved include C++, MPI, some statistics, possibly Python.
  4. MPI Simulator: Topology-aware Job Mapping
    In practice, the resources of a supercomputer are managed by a central system that allocates slices of it to different applications. It is best that different applications do not interfere with each other on the CPU or network links. To do this, the algorithm that allocates resources for each job must know how the compute nodes are connected with each other (topology), and select slices that are convex (i.e., messages do not use network resources outside of that slice).
    This project will start with the fat-tree topology (where an approximate solution should be fairly straightforward), and then move on to more demanding topologies (torus and dragonfly). It involves work on both the algorithmic aspects and a C++ implementation.
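    A first approximation of the fat-tree case can be sketched as a greedy packing over leaf switches (a simplification: it ignores the upper tree levels and assumes a single pod; names are illustrative):

```python
def allocate_fat_tree(free_per_leaf, request):
    """Greedy sketch of convex allocation under one fat-tree pod: take nodes
    from as few leaf switches as possible, so the job's traffic stays low in
    the tree.  free_per_leaf: leaf id -> free node count.  Returns
    {leaf: nodes_taken}, or None if the request cannot be satisfied."""
    if request > sum(free_per_leaf.values()):
        return None
    alloc = {}
    # largest leaves first minimizes the number of leaves spanned
    for leaf, free in sorted(free_per_leaf.items(), key=lambda kv: -kv[1]):
        take = min(free, request)
        if take:
            alloc[leaf] = take
            request -= take
        if request == 0:
            break
    return alloc
```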
  5. MPI Simulator: Algorithm and modeling of collective communication primitives
    Context: With the rise of AI, the performance requirements of datacentres are increasing quickly, and we see a convergence with High Performance Computing (HPC) systems (also known as supercomputers). In particular, the technologies machines use to communicate are now very similar in both HPC and AI. In HPC, the Message Passing Interface (MPI) runtime has been the de-facto standard for distributed applications, and it recently inspired most of the distributed runtimes in the AI space as well. In particular, all those applications heavily rely on so-called collective communication primitives (e.g., all-to-all exchange of messages, or data reduction), where a group of processes distributed over multiple computers exchange data. Those collective primitives, however, introduce a lot of complexity themselves.
    Description: The student will first become familiar with the MPI runtime, and in particular its collective primitives. They will then survey the different algorithms used in state-of-the-art MPI implementations, striving to present them in a unified description framework. Finally, they will integrate a subset of those algorithms into a simulator developed in our team.
    Technical skills: C and/or C++ (the student will need to understand production source code); Fluency with algorithms.
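    As an example of the algorithms to be surveyed, here is the classic recursive-doubling allreduce, simulated over an array of per-rank values; it completes in log2(p) rounds and assumes a power-of-two number of ranks:

```python
def allreduce_recursive_doubling(vals):
    """Simulate recursive-doubling allreduce (sum): in round k, rank r
    exchanges its partial sum with partner r XOR 2^k.  After log2(p)
    rounds every rank holds the full sum.  Assumes len(vals) is a
    power of two."""
    p = len(vals)
    vals = vals[:]
    step = 1
    while step < p:
        new = vals[:]
        for r in range(p):
            partner = r ^ step           # exchange with partner, add its partial sum
            new[r] = vals[r] + vals[partner]
        vals = new
        step *= 2
    return vals
```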
  6. Supercomputing: Characterization of state-of-the-art High Performance interconnects
    Context: With the rise of AI, the performance requirements of datacentres are increasing quickly, and we see a convergence with High Performance Computing (HPC) systems (also known as supercomputers). In particular, the technologies machines use to communicate are now very similar in both HPC and AI. In order to reach the required performance levels, those interconnects and their middleware are quite complex, and achieving the best performance for both HPC and AI applications requires a good knowledge of those elements.
    Description: The student will first become familiar with the software ecosystem used in HPC clusters (batch systems, MPI runtime). Then, they will benchmark the interconnect technology available in various HPC centers, using a set of both state-of-the-art and in-house tools, to identify both intrinsic characteristics (e.g., latency, throughput) and performance on typical communication patterns. Eventually, those insights will be used to calibrate a network model developed in our team.
    Technical skills: C or C++; Python; Basic data analysis (average, standard deviation, linear regressions); Basic data visualisation (scatter plots, histograms).
  7. Supercomputing: Web-based dynamic data visualization
    Context: High Performance networks are very complex beasts, and understanding them in detail is hard, especially when many communications happen at the same time. Usually, administrators rely on a set of diagnostic tools to detect errors and saturation issues. However, there are cases where they could use more interactive data visualization, for instance when setting up a new interconnect or when trying to identify soft issues (e.g., congestion or performance degradation). In our lab, we have a tool that simulates these networks and can generate snapshots of their state, which we can then analyse interactively in a web browser using Javascript-based pages; but we lack the ability to view the evolution of the network over time.
    Description: The student will become familiar with common Javascript libraries (e.g., Node.js and D3.js) and the React framework (e.g., with Next.js). Then, they will implement a web application that streams simulated data from the server and presents it dynamically to the client. Over time, they will add more interactive functionality, such as: (1) selection of a subset of simulated nodes; (2) selection of a subset of simulated communication flows; (3) various view layers (contention level, Bit Error Rate); (4) grouping of simulated nodes by simulated application and/or rack.
    Technical skills: Web technologies (Javascript, React, HTML, DOM, CSS)
  8. Supercomputing: Dynamic 3D data visualization
    Context: High Performance networks are very complex beasts, and understanding them in detail is hard, especially when many communications happen at the same time. One way to tackle this issue is to leverage 3D graphics to get a qualitative understanding of the system. Currently, we have a graphical interface that uses Qt and the VTK library to display a near-real-time view of a network (currently only the 2D-Mesh topology).
    The student will become familiar with VTK and Qt through simple examples. Then they will improve the existing visualization tool. Depending on time and skill set, they may work on: (1) support for additional simulated network topologies; (2) representations for link usage and node/link failures; (3) a simplified computer-rack representation; (4) a WebGL or WebGPU rendering back-end.
    Technical skills: C++; 3D graphics fundamentals; Qt basics

Diploma Thesis Projects 2023-2024

  1. JVM: Dynamic Race Detection
    Implement a dynamic analysis in the OpenJDK JVM. The analysis will extend the C1 and C2 JIT compilers inside the JVM to produce instrumented code that tracks read and write accesses, producing a trace. Another tool can then read and analyze the trace to detect data races or deadlocks in parallel code.
  2. JVM: Port the Lucene search engine to TeraCache
    TeraCache is an extension of OpenJDK that can place a very large Java Heap on SSD and NVMe devices. The goal of this project is to port the Lucene search engine to TeraCache, so that it is able to store objects in the heap and avoid serialization onto disk files.
  3. Graph Analytics: Misleading Repurposing Detection
    Misleading repurposing occurs when a malicious user changes the identity of their social media account for a new purpose, while retaining their followers. The goal of this project is to replicate work in the corresponding paper by Elmas and Overdorf (in ICWSM-2023) and apply it to Greek-speaking Twitter traffic.
  4. Graph Analytics: Community Detection
    Compare a set of community detection algorithms using Twitter data. Correlate with real-world outcomes (election results, market surveys, official statistics).

Diploma Thesis Projects 2022-2023

  1. Compilers: LLVM T-Head Ba/Bb/Bs Support
    T-Head has a RISC-V SoC with vendor-specific extensions (see the spec). The goal of this task is to add support for the xtheadba, xtheadbb and xtheadbs extensions to LLVM and evaluate the speedup. Note that these extensions are similar to the RISC-V zb* (zba, zbb, zbc, zbs) extensions, which are already supported in LLVM. Besides implementing the instructions and extending the cost model, a proper integration is required in order to let the compiler optimize provided code and emit these instructions. Adding new test cases to LLVM's test infrastructure ensures future changes won't break the functionality. The evaluation of the speedup can be achieved by comparing the dynamic instruction count of SPEC CPU 2017 benchmarks using QEMU.
  2. Compilers: LLVM T-Head CondMov, Mac Support
    T-Head has a RISC-V SoC with vendor-specific extensions (see the spec). The goal of this task is to add support for the xtheadcondmov and xtheadmac extensions to LLVM and evaluate the speedup. Note that these extensions are similar to existing instructions in other architectures (MIPS has similar conditional-move instructions, AArch64 has similar multiply-accumulate instructions), which are already supported in LLVM. Besides implementing the instructions and extending the cost model, a proper integration is required in order to let the compiler optimize provided code and emit these instructions. Adding new test cases to LLVM's test infrastructure ensures future changes won't break the functionality. The evaluation of the speedup can be achieved by comparing the dynamic instruction count of SPEC CPU 2017 benchmarks.
  3. Compilers: LLVM MemPair, MemIdx, FMemIdx Support
    T-Head has a RISC-V SoC with vendor-specific extensions (see the spec). The goal of this task is to add support for the xtheadmempair, xtheadmemidx and xtheadfmemidx extensions to LLVM and evaluate the speedup. Note that these extensions are similar to instructions in AArch64, which are already supported in LLVM. Besides implementing the instructions and extending the cost model, a proper integration is required in order to let the compiler optimize provided code and emit these instructions. Adding new test cases to LLVM's test infrastructure ensures future changes won't break the functionality. The evaluation of the speedup can be achieved by comparing the dynamic instruction count of SPEC CPU 2017 benchmarks.
  4. JVM: Add support for secret bytecode
    The JVM uses a class loader to parse class files and load bytecode to be executed. The goal of this project is to change the class loader to support encrypted class files, aimed to keep the bytecode secret from external attackers. Work includes defining the threat model and possibly working also with the JIT compiler to minimize the attack window, i.e., the time that unencrypted bytecode stays in memory and can be stolen via e.g., a core dump.
  5. JVM: Java Garbage Collection
    Extend OpenJDK with support for annotations that control memory placement and lifetime of objects, and adapt the garbage collector to migrate objects accordingly.
  6. Programming Languages: Contextual effects in Rust
    Extend the type system of the Rust compiler to generate contextual effect constraints.
  7. Programming Languages: Locality in the Rust Memory Allocator
    Measure locality and object lifetimes in Rust programs.
  8. Social Network Analytics: Link Prediction
    Train a Machine Learning or Deep Learning model to predict user interactions on social media.

Diploma Thesis Projects 2021-2022

  1. Social Network Graph Analytics: Find blocking and ghosting
    Description: Design and implement an analysis to infer blocking between social network users, using a large dataset from Twitter. Investigate how blocks partition the graph, and discover how information diffusion is limited by users blocking or ghosting other users.
    Further information: Implement an information diffusion analysis on large multilayer graphs, using the Spark-GraphX distributed analytics runtime system. Use special features of the Twitter API to infer when users have blocked other users. Based on the inferred block relations, study two existing alternative algorithms for information diffusion in social networks, and discover how, on average, the existence of a single block edge limits information diffusion in the full graph.
    Related topics: Graph theory, information diffusion (algorithms used in epidemiology), graph analytics, multilayer graphs, statistics.
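    The core of the diffusion-limiting question can be sketched as reachability over the follower graph with block edges cut (a simplification of the diffusion algorithms mentioned above; names are illustrative):

```python
from collections import deque

def diffusion_reach(follows, blocks, source):
    """BFS over the follower graph, skipping edges whose endpoints are in a
    block relation: who can a post from `source` eventually reach?
    follows: user -> iterable of followers; blocks: set of (blocker, blocked)
    pairs (the edge is cut whichever role the endpoint plays)."""
    reached, frontier = {source}, deque([source])
    while frontier:
        u = frontier.popleft()
        for v in follows.get(u, ()):
            if v in reached or (u, v) in blocks or (v, u) in blocks:
                continue
            reached.add(v)
            frontier.append(v)
    return reached
```

    Comparing the reach set with and without the block set measures how much a single block edge cuts off downstream diffusion.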
  2. Social Network Graph Analytics: Predict account bans
    Try to predict account bans on the YouTube or Twitter social network graphs. This project will use graph embedding algorithms to represent evolving graphs and perform machine learning on the embeddings, to predict which nodes in the graph may be deleted in the near future.
  3. Programming Languages: Static Analysis in Rust
    Extend the type system of the Rust compiler to generate and solve constraints to compute a property.
  4. Programming Languages: Static Race Detection Warning Ranking
    Port, maintain, and evaluate the Locksmith static analyzer on a set of C programs. Data race detection produces a large list of warnings of possible races. This project will try to rank these warnings in order of importance, implementing and comparing two ranking approaches.

Diploma Thesis Projects 2020-2021

  1. Learn Rust, Parse Rust, Analyze Rust
    Familiarize yourself with existing open-source Rust front-ends, compare and benchmark them and develop a simple static checker for Rust programs in one of the existing front-ends.
  2. LLVM and OpenMP tasks
    Extend LLVM with OpenMP tasks, and link it with the PARTEE task-parallel runtime system. LLVM is a production compiler written in C++. PARTEE is an API and runtime system written in low-level C, targeting distributed DMA-based and shared-memory architectures.
  3. Java Garbage Collection
    Extend one of the OpenJDK JVM Garbage Collectors with support for placing objects on memory pages in a programmable way, in order to achieve better temporal and spatial locality. Then benchmark your GC with different kinds of applications.
  4. Artificial Intelligence on Social Network Graphs
    Extend an existing algorithm for graph embeddings with structural features for shapes other than triangles. Explore applications where such features make an improvement in existing Machine Learning or Deep Learning models, and measure the improvement.
  5. Social Network Graph Analytics: Community Detection
    Description: Analyze Twitter traffic on a given topic, such as Type 2 Diabetes, and map the involved community. Use community detection analysis to identify users and groups that repeatedly interact with the topic. Calculate usage patterns and correlation with demographic and geolocation data. Implement your analysis in a scalable analytics framework (Spark or Flink) and measure its performance and scalability.
  6. Social Network Graph Analytics: Information Diffusion
    Description: Design and implement a plugin for a Twitter crawler that will discover and report in real time the diffusion graphs produced for specific keywords, memes, or phrases.
    Further information: Information diffusion is the analysis of the propagation of information in networks. Retweets are an example of information diffusion, where an original piece of information may be propagated to users that were not immediately exposed to the original content in the graph, but were eventually able to see the content because of a path of retweets from the original poster to the user.
    The aim of this project is to develop a streaming analysis on streams of data produced by a Twitter crawler, and subsequently produce a stream of evolving diffusion graphs for selected memes, hashtags, phrases, URLs, topics, etc.
    Related topics: Graph theory, information diffusion (algorithms used in epidemiology), graph analytics, multilayer graphs, statistics.
  7. Static Analysis in Infer
    Description: Extend the Infer static analysis system with contextual effects.
    Related information: Infer is an open source tool mainly developed by Facebook that can analyze programs written in C, C++, Objective C, and Java. It models heaps using propositions in separation logic.
    Related reading:
    • Find the Infer repository on GitHub, read its documentation, install and compile it, and use it to analyze example programs.
    • Read on separation logic, starting from the Wikipedia page and the seminal paper by Reynolds.

Diploma Thesis Projects 2019-2020

  1. OpenMP distributed memory allocation
    Description: Extend OpenMP with directives for memory allocation patterns tailored to NUMA or distributed memories.
    Related reading: Region-based memory management, the Legion parallel programming language.
  2. Fault-tolerant PARTEE tasks
    Description: Extend the PARTEE runtime system with support for local and global checkpointing and recovery from both transient and permanent errors. Measure the overhead of fault tolerance on task-parallel programs. This project will use experimental hardware and requires physical presence at FORTH.
  3. Social Network Graph Analytics: Community Detection
    Description: Design and implement an analysis that discovers users speaking about a specific topic in social media, and mines the corresponding graph. Analyze the detected community with respect to placement in the general audience.
  4. Streaming Graph Analytics in Flink
    Description: Augment the Graph-Streaming library of Flink with additional streaming algorithm implementations.
  5. Graph Analytics in Flink
    Description: Reimplement an existing graph analytics computation in the Flink distributed analytics runtime system.
    Further information: Read the TwitterMancer paper and implement feature extraction by modifying the Triangle Count algorithm accordingly.
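    As a starting point, here is the plain (non-distributed) triangle-count primitive that the feature extraction would modify; the Flink version would express the same neighbour-pair check over distributed edge sets:

```python
from itertools import combinations

def triangle_count(edges):
    """Count triangles in an undirected graph: for every vertex, count the
    pairs of its neighbours that are themselves connected.  Each triangle
    is found once per corner, hence the division by 3."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    found = 0
    for u, nbrs in adj.items():
        for v, w in combinations(nbrs, 2):
            if w in adj[v]:
                found += 1
    return found // 3
```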

Diploma Thesis Projects 2018-2019

  1. MPI benchmarking
    Description: Write tests and test automation, port existing benchmarks, and evaluate MPI applications on experimental hardware platforms with DMA-accelerated MPI.
  2. MPI collectives
    Description: Understand the existing implementation of MPI collectives and investigate alternative implementations better tailored to DMA-accelerated communication and very low-latency custom networks in experimental HPC hardware.
  3. Linux Kernel
    Description: Add support for kernel self-extraction during boot for the RISC-V architecture.
  4. Linux Kernel
    Description: Extend the perf tool (implementing all required in-kernel support) with additional counters for experimental HPC architectures.
  5. OpenMP tasks in Eclipse
    Description: Learn the Eclipse C internal representation, learn the workings of an existing static analysis engine and interface the two.
  6. Twitter Flu Trends
    Calculate the phases of flu epidemics using Twitter data.

Diploma Thesis Projects 2017-2018

  1. Benchmarking Kernel Memory Allocators
    Compare kernel memory allocation across an off-the-shelf platform based on x86 machines and an experimental ARM-based blade. Develop benchmarks and draw conclusions.
  2. Twitter Graph Analytics for Targeted Marketing
    Parallelize and optimize existing algorithms for distributed analytics. Develop a marketing application for the Twitter graph for the Spark and Spark/GraphX analytics engines.
  3. Graph Analytics for Flink Gelly
    Benchmark Flink Gelly with large social network graphs. Replicate existing analytics pipelines for Twitter in Flink.

Diploma Thesis Projects 2016-2017

  1. Alternative RDD implementations for Spark
    Description: Learn to code in Scala and Spark and write one or more extensions of Spark RDDs optimized for specific algorithms.

Diploma Thesis Projects 2013-2014

  1. LLVM and tasks
    Description: Extend LLVM with a "spawn" keyword that calls a function in parallel, as in Cilk, and link it with the PARTEE task-parallel runtime system.

Diploma Thesis Projects 2012-2013

  1. Task-parallel Fault Tolerance
    Extend the BDDT runtime system with fault-tolerance. Define a realistic fault model for permanent and transient faults on existing multicore computers. Extend the BDDT runtime system with support for local and global checkpointing and recovery from both transient and permanent errors. Measure the overhead of fault tolerance on task-parallel programs.
  2. Runtime Dependencies in Recursively Parallel Programs
    Implement a runtime analysis for dependencies among recursively-parallel tasks, and extend an existing runtime system (e.g., Cilk) with a dependency-aware scheduler.
  3. Static analysis in Eclipse
    Learn the architecture of the Eclipse IDE (for either Java or C programming), including the AST and analysis frameworks, and write an Eclipse interface for an existing static analysis engine.

Internship Topics 2012

  1. A fault-tolerant task parallel runtime
    Positions: 1
    Lab: CARV
    Description: Understand the BDDT runtime and add support for checkpointing of computations, and restoring to an earlier point on fault.