I am a recent PhD graduate in Mathematics from Université Sorbonne Paris Nord, having defended my thesis in November 2024. My research focuses on quantitative finance, where I aim to enhance traditional financial models by integrating stochastic analysis with machine learning and deep learning techniques.
My doctoral work encompasses a broad spectrum of research areas, including market indicator diffusion, model calibration, changes of measure, and data quality enhancement. These studies have led to contributions in various financial domains, such as pricing, risk management, and stress testing.
In addition to my research, I am an adjunct lecturer at Université Paris Saclay, where I have taught Machine Learning for Finance and tutored students in the Data Camp and Cutting Edge Finance Projects for the past three years.
I am passionate about the intersection of stochastic analysis and artificial intelligence, and this website showcases my research and teaching experiences.
During my PhD, I actively engaged in productive collaboration with the financial industry, working closely with financial institutions, banks, and public organizations. This collaboration deepened my practical understanding of real-world challenges in financial mathematics and allowed me to contribute innovative, efficient solutions to complex problems. Thanks to the industry-partnered structure of my doctorate (the French CIFRE program), I have worked on various topics in quantitative finance, including but not limited to:
In addition to the applied work described above, my research has led to articles submitted for publication. Below, you will find their abstracts and links to the open-access preprints.
This figure compares some of the G2++ model parameters calibrated using our deep learning approach against the actual parameters (the second row is a zoomed-in version of the graphic in the first row).
Abstract: Every financial institution needs to apprehend the behavior of interest rates. Although the use of deep learning is growing very fast, classic rate models such as CIR or the Gaussian family are still widely used, for many reasons (expertise, ease of use, ...). We propose to calibrate the five parameters of the G2++ model using neural networks. To achieve that, we construct synthetic data sets of parameters drawn uniformly from a reference set of parameters calibrated from the market. From those parameters, we compute zero-coupon and forward rates and their covariances and correlations. Our first model is a fully connected neural network and uses only covariances and correlations. We show that covariances are better suited to the problem than correlations. The second model is a convolutional neural network using only zero-coupon rates with no transformation. The methods we propose perform very quickly (less than 0.3 seconds for 2,000 calibrations) and achieve low errors and good fits.
Mohamed Ben Alaya, Ahmed Kebaier, Djibril Sarr, 2021. (Preprint available here. A first version of this work was published in the 2022 annual proceedings of the SFdS, Société Française de Statistique, pages 906-911.)
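To give a concrete, if simplified, flavor of the approach, here is a minimal PyTorch sketch of the fully connected variant: a small network maps covariance features of the rates to the five G2++ parameters. The feature dimension, layer sizes, and training data below are illustrative placeholders, not the paper's actual architecture or synthetic data sets.

```python
import torch
import torch.nn as nn

# Hypothetical setup: each sample is a flattened covariance matrix of
# zero-coupon rates (here 10x10 -> 55 upper-triangular entries); the
# target is the five G2++ parameters (a, b, sigma, eta, rho).
N_FEATURES, N_PARAMS = 55, 5

model = nn.Sequential(
    nn.Linear(N_FEATURES, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, N_PARAMS),
)

# Random placeholders standing in for the paper's simulated sets:
# parameters drawn uniformly from a market-calibrated reference set,
# with covariances computed from the corresponding simulated rates.
X = torch.rand(2000, N_FEATURES)
y = torch.rand(2000, N_PARAMS)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

# Once trained, each calibration is a single forward pass, which is
# why thousands of calibrations take well under a second.
params_hat = model(X[:5]).detach()
```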
This figure shows the back-testing of the model, highlighting how the simulated credit spreads (shades of blue) closely follow and encompass the actual credit spread curve (black line) over time.
Abstract: This paper introduces a novel stochastic model for credit spreads. The stochastic approach leverages the diffusion of default intensities via a CIR++ model and is formulated within a risk-neutral probability space. Our research primarily addresses two gaps in the literature. The first is the lack of credit spread models founded on a stochastic basis that enables continuous modeling, as many existing models rely on factorial assumptions. The second is the limited availability of models that directly yield a term structure of credit spreads. An intermediate result of our model is the provision of a term structure for the prices of defaultable bonds. We present the model alongside an innovative, practical, and conservative calibration approach that minimizes the error between historical and theoretical volatilities of default intensities. We demonstrate the robustness of both the model and its calibration process by comparing its behavior to historical credit spread values. Our findings indicate that the model not only produces realistic credit spread term structure curves but also exhibits consistent diffusion over time. Additionally, the model accurately fits the initial term structure of implied survival probabilities and provides an analytical expression for the credit spread of any given maturity at any future time.
Mohamed Ben Alaya, Ahmed Kebaier, Djibril Sarr, 2024. (Preprint available here.)
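For readers curious about the diffusion mechanics, the sketch below simulates the CIR component of a CIR++ default intensity, λ(t) = x(t) + ψ(t), with a full-truncation Euler scheme. All parameter values and the deterministic shift ψ are illustrative placeholders, not the paper's calibrated quantities.

```python
import numpy as np

def simulate_cirpp_intensity(x0, kappa, theta, sigma, psi, T, n_steps, n_paths, seed=0):
    """Simulate CIR++ intensities lambda(t) = x(t) + psi(t), where x follows
    dx = kappa*(theta - x) dt + sigma*sqrt(x) dW (full-truncation Euler)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    x = np.full((n_paths, n_steps + 1), x0)
    for i in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), n_paths)
        x_pos = np.maximum(x[:, i], 0.0)  # full truncation keeps the sqrt real
        x[:, i + 1] = x[:, i] + kappa * (theta - x_pos) * dt + sigma * np.sqrt(x_pos) * dw
    return t, x + psi(t)  # the shift psi fits the initial term structure

# Illustrative parameters only (not the paper's calibrated values).
t, lam = simulate_cirpp_intensity(
    x0=0.02, kappa=0.5, theta=0.03, sigma=0.1,
    psi=lambda s: 0.005 * np.ones_like(s), T=5.0, n_steps=500, n_paths=1000,
)
print(lam.mean(axis=0)[-1])  # average simulated intensity at maturity
```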
This figure shows how the expectation of the Real-World model (black line) closely matches any given credit spread (red line). The blue lines represent the 10% and 90% quantiles of the simulated credit spreads.
Abstract: This research presents a comprehensive framework for transitioning financial diffusion models from the risk-neutral (RN) measure to the real-world (RW) measure, leveraging results from probability theory, specifically Girsanov's theorem. The RN measure, fundamental in derivative pricing, is contrasted with the RW measure, which incorporates risk premiums and better reflects actual market behavior and investor preferences, making it crucial for risk management. We address the challenges of incorporating real-world dynamics into financial models, such as accounting for market premiums, producing realistic term structures of market indicators, and fitting any arbitrarily given market curve. Our framework is designed to be general, applicable to a variety of diffusion models, including those with non-additive noise such as the CIR++ model. Through case studies involving Goldman Sachs' 2024 global credit outlook forecasts and the European Banking Authority (EBA) 2023 stress tests, we validate the robustness, practical relevance and applicability of our methodology. This work contributes to the literature by providing a versatile tool for better risk measures and enhancing the realism of financial models under the RW measure. Our model's versatility extends to stress testing and scenario analysis, providing practitioners with a powerful tool to evaluate various what-if scenarios and make well-informed decisions, particularly in pricing and risk management strategies.
Mohamed Ben Alaya, Ahmed Kebaier, Djibril Sarr, 2024. (Preprint available here.)
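In broad strokes, the mechanism underlying the framework is the standard Girsanov drift adjustment: absorbing a market price of risk λₜ into the drift turns risk-neutral dynamics into real-world ones.

$$
\begin{aligned}
dX_t &= \mu^{\mathbb{Q}}(t, X_t)\,dt + \sigma(t, X_t)\,dW_t^{\mathbb{Q}} \quad \text{(risk-neutral dynamics)}\\[2pt]
dW_t^{\mathbb{Q}} &= dW_t^{\mathbb{P}} + \lambda_t\,dt \quad \text{(Girsanov drift shift)}\\[2pt]
dX_t &= \bigl(\mu^{\mathbb{Q}}(t, X_t) + \sigma(t, X_t)\,\lambda_t\bigr)\,dt + \sigma(t, X_t)\,dW_t^{\mathbb{P}} \quad \text{(real-world dynamics)}
\end{aligned}
$$

Under the real-world measure the drift gains the risk-premium term σλₜ; the paper's contribution lies in specifying λₜ so that the resulting dynamics fit an arbitrarily given market curve, including for non-additive-noise models such as CIR++.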
This figure shows a step in our typo detection algorithm. High Damerau-Levenshtein scores (black line) indicate likely variations of the same word; drops below the threshold (red line) suggest genuinely different words.
Abstract: In the era of big data, ensuring the quality of datasets has become increasingly crucial across various domains. We propose a comprehensive framework designed to automatically assess and rectify data quality issues in any given dataset, regardless of its specific content, focusing on both textual and numerical data. Our primary objective is to address three fundamental types of defects: absence, redundancy, and incoherence. At the heart of our approach lies a rigorous demand for both explainability and interpretability, ensuring that the rationale behind the identification and correction of data anomalies is transparent and understandable. To achieve this, we adopt a hybrid approach that integrates statistical methods with machine learning algorithms. By leveraging statistical techniques alongside machine learning, we strike a balance between accuracy and explainability, enabling users to trust and comprehend the assessment process. Acknowledging the challenges associated with automating the data quality assessment process, particularly in terms of time efficiency and accuracy, we adopt a pragmatic strategy, employing resource-intensive algorithms only when necessary, while favoring simpler, more efficient solutions whenever possible. Through a practical analysis conducted on a publicly provided dataset, we illustrate the challenges that arise when trying to enhance data quality while preserving explainability. We demonstrate the effectiveness of our approach in detecting and rectifying missing values, duplicates, and typographical errors, as well as the challenges remaining to be addressed to achieve similar accuracy on statistical outliers and logic errors under the constraints set in our work.
Djibril Sarr, 2024. (Preprint available here. This work builds substantially on the work presented at the 2021 Mathematics and Industry Challenge of AMIES, Agence pour les Mathématiques en Interaction avec l’Entreprise et la Société, where it was awarded first prize in the data quality competition.)
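As a toy illustration of the typo-detection step shown in the figure above (the normalization and the cutoff below are my own placeholders, not the framework's calibrated choices), one can scan a sorted vocabulary and flag adjacent words whose Damerau-Levenshtein similarity exceeds a threshold as likely variants of the same word:

```python
def damerau_levenshtein(a: str, b: str) -> int:
    """Optimal-string-alignment Damerau-Levenshtein distance
    (edits: insert, delete, substitute, transpose adjacent chars)."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - damerau_levenshtein(a, b) / max(len(a), len(b))

# Flag neighboring words in a sorted vocabulary as likely typo variants
# when their similarity exceeds an (illustrative) threshold.
vocab = sorted(["address", "adress", "adresse", "balance", "banana"])
THRESHOLD = 0.8  # placeholder cutoff, not the framework's calibrated value
for w1, w2 in zip(vocab, vocab[1:]):
    if similarity(w1, w2) >= THRESHOLD:
        print(f"possible variants: {w1!r} ~ {w2!r}")
```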
Institution: Master 2 Quantitative Finance, Université Paris Saclay (Since 2022-2023)
In this course, I introduce students to the field of machine learning (ML) and deep learning (DL) applied to finance. The curriculum begins by reinforcing fundamental ML/DL concepts such as logistic regression and feedforward neural networks. The lectures then proceed with practical applications aimed at solving real-world problems, using Python and R. Key areas include:
Students leave the course equipped with essential knowledge and practical experience for applying ML and DL tools across sectors of finance and, most importantly, with the ability to generalize the techniques seen in class to the problems they will encounter in their careers.
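As a flavor of the hands-on sessions (the data here is synthetic and purely illustrative, not course material), a first exercise might fit a logistic regression to classify defaults from two balance-sheet-style features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for course data: two features (leverage, income)
# and a default flag whose probability is loosely driven by leverage.
rng = np.random.default_rng(42)
n = 1000
leverage = rng.uniform(0.0, 1.0, n)
income = rng.normal(50.0, 15.0, n)
p_default = 1.0 / (1.0 + np.exp(-(4.0 * leverage - 0.05 * income)))
y = rng.binomial(1, p_default)
X = np.column_stack([leverage, income])

# Standard train/test split, fit, and evaluation.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```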
Institution: Master 2 Quantitative Finance, Université Paris Saclay (Since 2022-2023)
This project-based course challenges students to solve real-world quantitative finance problems using advanced methodologies. As a tutor, I mentor students through complex issues throughout the second semester.
Students develop problem-solving and teamwork skills and gain exposure to innovative finance techniques used in professional settings.
Institution: Master 2 Data Science, Université Paris Saclay (2021-2022)
As a tutor, I supervised and mentored students participating in global Kaggle competitions. The objective was to expose students to competitive data science projects, emphasizing critical thinking, practical implementation, and teamwork. Key projects included:
If you would like to reach me, feel free to send an email to: