Improving LLM Confidence with Step-by-Step Reasoning

In a recent paper, “Step-wise Decomposition Improves Calibration for Answering Multi-Hop Questions”, we explore a subtle but important problem in large language models: they’re often way too confident, even when the answer is wrong. This post walks through the core idea behind the paper, why calibration matters, and how a simple change to prompting (breaking reasoning into steps) can significantly improve how much we can trust a model’s stated confidence. ...
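
Before going further, it helps to pin down what “calibration” means here. A model is well calibrated if, among all the answers it gives with confidence p, roughly a p fraction are actually correct. One standard way to quantify the gap is expected calibration error (ECE). The snippet below is a minimal, generic sketch of ECE for illustration; it is not necessarily the exact metric or binning scheme used in the paper.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by stated confidence and compare each bin's
    average confidence to its empirical accuracy; weight gaps by bin size."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Right-inclusive bins so a confidence of exactly 1.0 lands in the last bin.
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap
    return ece

# Toy example: a model that says 0.9 on every question
# but is right only 60% of the time.
conf = [0.9] * 10
right = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
print(expected_calibration_error(conf, right))  # ~0.30
```

In the toy example, all of the model’s confidence mass sits 0.3 above its actual accuracy, so the ECE is about 0.3. That is exactly the overconfidence pattern described above, and it is the quantity that better prompting should drive down.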