Further insights into who I am and the purpose of this blog.
Adrien Kouki
ML Engineer
L'Oréal
Hello! I'm Adrien Kouki (aka Kooks)
I have a passion for ML applications and data engineering. I'm currently an ML engineer at L'Oréal, deeply engaged in a worldwide cosmetics regulations project that combines agentic AI with the US, EU, and CN regulatory landscapes. It's a challenging context in which I build tools for automated analysis and data exposure. I work mainly with Python, GCP, SQL, Go, FastAPI, and AWS.
Why have this blog?
Over the years, I've tackled some pretty interesting challenges in ML systems and data infrastructure. This blog is where I share the solutions, gotchas, and lessons learned from those experiences. Think of it as my technical notebook made public.
Writing about complex problems forces me to really understand them, and I've found that other engineers often face similar challenges. So why not share what works (and what doesn't)? Each post is essentially me saying "here's what I learned the hard way, maybe it'll save you some time."
Always happy to discuss different approaches and hear how others have solved similar problems.
My credentials
I hold a Master's degree in Computer Science from IMT Atlantique, where I specialized in Machine Learning and Data Engineering. I also hold a Research Master's in Artificial Intelligence from Université Paris-Saclay. Additionally, I've obtained the following certification:
AWS Certified Machine Learning Associate (early adopter)
Highlights from my experience:
Generative AI Implementation: Designing and implementing advanced prompt engineering systems for automated marketing claims evaluation, reducing manual review time by 80%.
Data Pipeline Architecture: Developing web scraping solutions and building robust ETL processes with dbt and Dagster, enabling seamless data integration across data stores (a minimal sketch follows this list).
Knowledge Graph Innovation: Conducting research and implementation studies to map the relationships between product ingredients, claims, and regulatory requirements, creating a unified linked information ecosystem.
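To make the dbt/Dagster pattern above concrete, here's a minimal sketch of a scrape-then-parse pipeline as Dagster software-defined assets. The asset names and the toy parsing logic are illustrative stand-ins, not the actual project code:

```python
# Minimal Dagster sketch: scrape -> parse, as software-defined assets.
# Asset names and the toy parsing logic are hypothetical examples.
import dagster as dg

@dg.asset
def raw_product_pages():
    # Stand-in for a real web-scraping step.
    return ["<html>claim: dermatologist tested</html>"]

@dg.asset
def product_claims(raw_product_pages):
    # Parse raw pages into structured claim records; in the real
    # pipeline, dbt models would handle in-warehouse transformations.
    return [
        {"claim": page.split("claim: ")[1].removesuffix("</html>")}
        for page in raw_product_pages
    ]

if __name__ == "__main__":
    # Materialize both assets in dependency order.
    result = dg.materialize([raw_product_pages, product_claims])
    assert result.success
```

In a real setup the dbt models would sit downstream of the ingestion assets (e.g., via dagster-dbt), so the whole lineage shows up in one asset graph.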
Generative AI & OCR Integration: I integrated generative AI models (LLMs) and Optical Character Recognition (OCR) technologies to automate the processing and analysis of legal documents, improving data processing efficiency (a sketch of this pattern follows the list).
API Design & Maintenance: I designed and maintained efficient APIs to provide access to AI services and models, ensuring their performance and reliability. I also acted as an information relay between the product, DevOps, and data model teams.
Data Pipelines & Quality Control: I built data pipeline integrations with rigorous quality controls to guarantee data integrity and a reliable flow of information through the system.
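Here's a hedged sketch of the OCR-then-LLM flow mentioned above. It assumes pytesseract and Pillow for the OCR step; the LLM call is stubbed out, and all function names are hypothetical:

```python
# Sketch of an OCR -> LLM document flow. Assumes pytesseract and
# Pillow are installed; the LLM step is a stub, not a real client call.
import pytesseract
from PIL import Image

def ocr_page(path: str) -> str:
    """Extract raw text from one scanned page of a legal document."""
    return pytesseract.image_to_string(Image.open(path))

def analyze_with_llm(text: str) -> dict:
    """Stand-in for an LLM call (e.g., to a hosted model API)."""
    # A real implementation would send `text` with a task-specific
    # prompt and parse the structured response.
    return {"summary": text[:200], "length": len(text)}

def process_document(pages: list[str]) -> dict:
    full_text = "\n".join(ocr_page(p) for p in pages)
    return analyze_with_llm(full_text)
```

Keeping OCR and analysis as separate functions makes each step easy to test, monitor, and swap out independently.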
Led a team of 4 data engineers in the development and implementation of supply chain solutions for manufacturing clients in the aircraft, jewelry, and automotive industries, building ETL pipelines with Apache Airflow on Azure/OVH.
Leveraged expertise in modern data engineering tools, including Apache Airflow, Kubernetes, dbt Core, Terraform, and Ansible, to build and maintain robust data pipelines and infrastructure.
Utilized Agile project management methodologies, with a focus on clear communication and task management using Linear, to ensure on-time and within-budget project delivery.
Working closely with the stock optimization and configurability teams, I helped deliver the right data to enable the functionality needed for client demos.
My tech stack of choice
I'm tech-agnostic and highly adaptable - I believe in choosing the right tool for the job rather than being locked into any specific ecosystem. While I have experience across various platforms (including Google Cloud), here's the stack I'm most experienced with and typically reach for when building scalable ML and data engineering solutions:
Python - My go-to for ML, data processing, and backend services
TypeScript - For robust frontend and Node.js applications
Docker - Containerization for consistent deployments
Golang - For serverless microservices design and high-performance backends
AWS - My preferred cloud platform for everything
Terraform - Infrastructure as code to provision cloud providers while remaining cloud-agnostic
Amazon Bedrock - For building agentic AI and LLM applications
AWS Data Suite - Glue, Athena, Redshift, and EMR for data engineering
FastAPI - Building high-performance APIs and ML services (see the sketch after this list)
SQL - Data querying and analytics across various databases
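To show how Python and FastAPI fit together in this stack, here's a minimal, self-contained sketch of a scoring endpoint. The route, schemas, and scoring rule are illustrative placeholders, not a real service:

```python
# Minimal FastAPI sketch: a toy claim-scoring endpoint.
# Endpoint name, schemas, and the scoring rule are hypothetical.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="claims-scoring-demo")

class ClaimRequest(BaseModel):
    text: str

class ClaimResponse(BaseModel):
    compliant: bool
    score: float

@app.post("/score", response_model=ClaimResponse)
def score_claim(req: ClaimRequest) -> ClaimResponse:
    # Stand-in for a real model call; flags one hard-coded phrase.
    flagged = "miracle cure" in req.text.lower()
    return ClaimResponse(compliant=not flagged, score=0.1 if flagged else 0.9)
```

Run it with `uvicorn main:app --reload` and POST `{"text": "..."}` to `/score`; swapping the stub for a real model leaves the interface unchanged.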