Selected projects
Cloud (AWS)
- Building one platform on top of multiple AWS accounts with strong Identity and Access Management (AWS Organizations, SSO, IAM, Cost Explorer),
- Software Management (code repositories in CodeCommit, Python packages published in CodeArtifact, Docker images stored in ECR, secrets in Secrets Manager),
- Storage and DBs: S3, Athena (big data) and Glue (ETL), DynamoDB (NoSQL), RDS (PostgreSQL),
- Virtual machines: EC2 and VPC,
- Backends and batch jobs built using: SAM (Serverless Application Model), Lambda, Step Functions, API Gateway, App Runner, ECS, SQS (queuing), SNS (notifications),
- Real-time API built using AppSync (GraphQL and Pub/Sub).
Big Data
Using Spark in Scala to run distributed ETL jobs on on-premises Hadoop clusters. Data stored in HDFS, queried with the distributed SQL engine Impala, and in the Kudu datastore.
Data Engineering
Implemented complex pipelines to produce all dashboards for the market compliance officers (volume and open interest market shares, on all cleared instruments, at day and intra-day scales).
Orchestration using Apache Airflow. Data collected from multiple sources and systems (Reuters, MongoDB, Elasticsearch, web services, emails, web scraping). Data stored in MongoDB and SQL Server. REST API using Flask (Python). Kerberos authentication. Power BI for visualization. Kafka for message queuing.
Predictive analytics
Generic pipeline in scikit-learn for time series. Features generated from sliding time windows. Used k-fold cross-validation to compare models through robust scores, and GridSearchCV for hyperparameter optimization. Used mainly ensemble methods: Random Forest, Gradient Boosting (XGBoost), AdaBoost.
Applied in many cases: crude flat price, shipping prices, Platts physical premia, etc. Used regression results to run and backtest simple trading strategies.
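As an illustration of the sliding-window feature step, here is a minimal standard-library sketch. The function name and the four summary features (last value, mean, standard deviation, momentum) are hypothetical choices for this example, not the features of the actual pipeline; in practice the resulting rows would be passed to scikit-learn estimators.

```python
from statistics import mean, pstdev


def sliding_window_features(series, window):
    """Build one feature row per complete window of past observations.

    Each row summarises the `window` values that precede the target,
    so no future information leaks into the features.
    """
    features, targets = [], []
    for end in range(window, len(series)):
        past = series[end - window:end]
        features.append({
            "last": past[-1],
            "mean": mean(past),
            "std": pstdev(past),
            "momentum": past[-1] - past[0],
        })
        targets.append(series[end])
    return features, targets


# Toy price series (made-up numbers, for illustration only).
prices = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
X, y = sliding_window_features(prices, window=3)
```

Each target is the observation immediately after its window, which keeps the setup compatible with time-ordered cross-validation splits.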
NLP
NLP/NER (Natural Language Processing / Named Entity Recognition): parsing of shipping broker reports using spaCy. Used to complete shipping data (AIS) and analyse correlations.
Structuring
Priced and booked complex payoffs for many years: TARN, double discount swaps, end-of-month options, callable options, storage contracts, npq gas contracts, options on spreads, barrier options, basket options, size options, etc. Pricing and management of options on illiquid underlyings (Maya and JCC).
Implemented many payoffs in various pricing libraries. Monte Carlo payoff scripting in Lua.
Modelling
Developed a time spread model, used daily to price Calendar Spread Options (CSO) and to mark storage contracts to market.
Evolved, managed and calibrated all oil models for many years: local volatility, intra-curve and inter-curve correlation, spread and time-spread vol.
Calibration of parametric correlation within the LMM (Libor Market Model). Copula calibration for CMS-Spread.
Managed linear basket proxies for illiquid underlyings (Maya crude, Japan Crude Cocktail (JCC), petrochemical indices, full refinery margins) and priced options on these underlyings.
Built an automatic forward curve builder for oil futures markets.
Implemented American Monte Carlo for forward callable swaps (Longstaff-Schwartz algorithm).
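The Longstaff-Schwartz idea can be sketched on a much simpler instrument than a forward callable swap. The example below prices an American put under geometric Brownian motion, regressing discounted future cash flows on a quadratic basis of the spot for in-the-money paths. The instrument, the basis, the function names and the parameters are all assumptions made for this illustration, not the production implementation.

```python
import math
import random


def solve_linear(a, b):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via the normal equations."""
    basis = [[1.0, x, x * x] for x in xs]
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(basis, ys)) for i in range(3)]
    return solve_linear(ata, aty)


def american_put_lsm(s0, strike, rate, vol, maturity, steps, paths, seed=0):
    """Longstaff-Schwartz estimate of an American put (exercise dates 1..steps)."""
    rng = random.Random(seed)
    dt = maturity / steps
    disc = math.exp(-rate * dt)
    drift = (rate - 0.5 * vol * vol) * dt
    diffusion = vol * math.sqrt(dt)
    # Simulate risk-neutral geometric Brownian motion paths.
    spots = []
    for _ in range(paths):
        path = [s0]
        for _ in range(steps):
            path.append(path[-1] * math.exp(drift + diffusion * rng.gauss(0.0, 1.0)))
        spots.append(path)
    # Cash flow of each path, expressed in value at the current step.
    cash = [max(strike - path[-1], 0.0) for path in spots]
    for t in range(steps - 1, 0, -1):
        cash = [c * disc for c in cash]
        itm = [i for i in range(paths) if spots[i][t] < strike]
        if len(itm) < 3:
            continue  # not enough points to fit three coefficients
        xs = [spots[i][t] / strike for i in itm]  # scaled for conditioning
        c0, c1, c2 = fit_quadratic(xs, [cash[i] for i in itm])
        for i, x in zip(itm, xs):
            continuation = c0 + c1 * x + c2 * x * x
            exercise = strike - spots[i][t]
            if exercise > continuation:
                cash[i] = exercise
    return disc * sum(cash) / paths


price = american_put_lsm(36.0, 40.0, 0.06, 0.2, 1.0, steps=50, paths=4000)
```

For a callable swap the same backward loop would apply, with the swap's remaining value as the exercise payoff and rate-curve state variables in the regression basis.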
Credit risk pricing: CVA and CVaR, using a simple Poisson process to simulate the default process.
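A minimal sketch of the Poisson default simulation: with a constant intensity, the default time is the first jump of the process and is exponentially distributed. The deterministic exposure profile, the flat discount rate, the function names and the numbers below are all simplifying assumptions for this example.

```python
import math
import random


def simulate_default_time(hazard_rate, rng):
    """First jump time of a constant-intensity Poisson process (exponential law)."""
    return rng.expovariate(hazard_rate)


def monte_carlo_cva(exposure, hazard_rate, recovery, rate, maturity, paths, seed=0):
    """Estimate CVA = (1 - R) * E[D(tau) * exposure(tau) * 1{tau <= T}]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(paths):
        tau = simulate_default_time(hazard_rate, rng)
        if tau <= maturity:
            # Loss is only incurred if default happens before maturity.
            total += math.exp(-rate * tau) * exposure(tau)
    return (1.0 - recovery) * total / paths


# Hypothetical inputs: flat exposure of 100, 2% hazard rate, 40% recovery.
cva = monte_carlo_cva(lambda t: 100.0, hazard_rate=0.02, recovery=0.4,
                      rate=0.0, maturity=5.0, paths=100_000)
```

With a flat exposure and zero rates the estimate can be checked against the closed form (1 - R) * exposure * (1 - exp(-hazard_rate * maturity)).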
Integration, testing and validation of the Ito33 library for convertible bonds and CB options (within the Sophis toolkit).
Others: Implementation of adapters and auto-calibration (Hagan) for pricing Bermudan swaptions on the Black-Derman-Toy model. Implementation of the SABR model (calibration and pricing). Quadratic programming under constraints for intra-day yield curve updating. Brownian bridge for range accruals. Inflation future pricer (deterministic).
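The Brownian bridge item can be illustrated with a short sketch: given the value of a standard Brownian motion at two dates, intermediate points are Gaussian with a known conditional mean and variance, which allows daily fixings to be filled in between coarse simulation dates. The function names and the range-accrual helper are hypothetical, and a unit-volatility driver is assumed.

```python
import math
import random


def bridge_moments(t, t0, x0, t1, x1):
    """Conditional mean and variance of W(t) given W(t0) = x0 and W(t1) = x1."""
    w = (t - t0) / (t1 - t0)
    mean = x0 + w * (x1 - x0)
    variance = (t - t0) * (t1 - t) / (t1 - t0)
    return mean, variance


def sample_bridge(times, x_start, x_end, rng):
    """Fill a path on `times` between two known endpoints, left to right."""
    path = [x_start]
    t_end = times[-1]
    for prev_t, t in zip(times[:-2], times[1:-1]):
        # Condition on the last sampled point and on the fixed endpoint.
        mean, variance = bridge_moments(t, prev_t, path[-1], t_end, x_end)
        path.append(mean + math.sqrt(variance) * rng.gauss(0.0, 1.0))
    path.append(x_end)
    return path


def accrual_fraction(path, lower, upper):
    """Share of fixings inside [lower, upper], as used by a range accrual coupon."""
    return sum(lower <= x <= upper for x in path) / len(path)


rng = random.Random(42)
times = [0.0, 0.25, 0.5, 0.75, 1.0]
path = sample_bridge(times, x_start=0.0, x_end=0.3, rng=rng)
fraction = accrual_fraction(path, lower=-0.5, upper=0.5)
```

Sampling sequentially against the fixed endpoint relies on the Markov property of Brownian motion, so each intermediate draw only needs the previous point and the terminal value.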
Trading tools
Implemented a portfolio hedging simulator and back-testing engine, compatible with all payoffs and models, using historical or simulated market data.
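The core loop of such a back-test can be sketched in a few lines. This is a deliberately reduced illustration, assuming a single option hedged daily with its previous-day delta and no financing or transaction costs; the function name and the input numbers are made up.

```python
def backtest_delta_hedge(spots, option_values, deltas):
    """Daily P&L of a long option hedged with -delta units of the underlying.

    The hedge held over day i is the delta computed at the close of day i-1.
    """
    pnl = []
    for i in range(1, len(spots)):
        option_move = option_values[i] - option_values[i - 1]
        hedge_move = -deltas[i - 1] * (spots[i] - spots[i - 1])
        pnl.append(option_move + hedge_move)
    return pnl


# Illustrative inputs only: three days of spot, option value and delta.
spots = [100.0, 102.0, 101.0]
option_values = [5.0, 6.0, 5.5]
deltas = [0.5, 0.6, 0.55]
daily_pnl = backtest_delta_hedge(spots, option_values, deltas)
```

Because the loop only consumes series of spots, values and deltas, the same code works whether those come from historical data or from simulated paths.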
Front end
Built a site for the follow-up and monitoring of all team projects and services, using Vue.js. Used Prisma to define the data model and mount a GraphQL server and playground, with automatically generated CRUD resolvers, plus custom resolvers written in JavaScript.
Other projects based on Python (Flask and Django).
Tooling
Built a Python package and CLI tool used to automate the creation of REST micro-services, using cookiecutter to instantiate the template, marshmallow (serialization lib), flask-rest-api, mongoengine, and Kerberos authentication.
Computing
Built a Python package to send Monte Carlo computations to Microsoft HPC (High Performance Computing).
Python code to manage LSF (IBM) Monte Carlo jobs on Pangea (Total supercomputer).
Architecture
Design and implementation of a new pricing library framework: const functors published in thread-safe namespaces; composite pattern for unifying API I/O across Excel/COM/C++; smart pointers, serialization, object-oriented workflow in Excel; metadata for automatic type-checking and self-documentation.
Implemented a regression testing framework for XL and COM APIs.
Refactoring
Migrated the pricing library from x86 to x64 (C++, XLL add-in, and embedded Python). Used Python to rewrite thousands of lines of C++ code.
Factorization of XLL and COM APIs within the Adfin Library (Reuters).