Papers

Research papers I've read over the years — with notes on what I took away and why it mattered.

AI Agents · Computer Vision · Document AI · Efficient ML · LLMs · Multimodal · NLP / Architectures · NLP / Retrieval

AI Agents

Computer Vision

End-to-End Object Detection with Transformers

Carion et al. · ECCV · 2020

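Recasts object detection as direct set prediction with a transformer and bipartite matching, which removes the need for hand-designed anchors and non-maximum suppression (NMS). Directly relevant to my thesis work on attention-based detection architectures at TUM.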

Object Detection · Transformers · DETR

Document AI

Efficient ML

LLMs

Scaling Laws for Neural Language Models

Kaplan et al. · arXiv · 2020

Shows empirically that language-model loss follows power laws in compute, dataset size, and model size. Informed how I reason about the cost-performance tradeoff when choosing models for production.

Scaling · LLMs · Empirical
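
To make the power-law idea concrete, here is a minimal sketch of fitting loss = a * N^b by least squares in log-log space. The function name `fit_power_law` and the data points are my own illustrative assumptions; the numbers are synthetic and are not results from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (log x, log y)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    # The slope of the regression line in log-log space is the exponent b.
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx
    # The intercept in log space gives log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic (made-up) model sizes and losses generated from a known power law.
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * n ** -0.08 for n in sizes]

a, b = fit_power_law(sizes, losses)
print(f"a = {a:.3f}, b = {b:.3f}")
```

On a log-log plot a power law is a straight line, so this kind of fit is a quick sanity check on whether measured losses actually follow one before extrapolating to a larger model.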

Multimodal

NLP / Architectures

Attention Is All You Need

Vaswani et al. · NeurIPS · 2017

The paper that reshaped the field. Self-attention replacing recurrence was the key insight. I re-read this every time I need to reason about sequence models from first principles.

Transformers · Attention · NLP
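
When I reason from first principles, I usually start from scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. Below is a minimal single-head sketch in plain Python, assuming no masking, no learned projections, and no batching; the helper names `softmax` and `attention` are illustrative, not taken from any library.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention for lists of row vectors.

    Q: queries (n_q x d_k), K: keys (n_k x d_k), V: values (n_k x d_v).
    Returns an n_q x d_v list of outputs.
    """
    d_k = len(K[0])
    d_v = len(V[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(d_v)])
    return out


# Toy example: both keys are identical, so the query attends to them equally
# and the output is the plain average of the two value vectors.
Q = [[1.0, 1.0]]
K = [[1.0, 0.0], [1.0, 0.0]]
V = [[0.0, 2.0], [4.0, 6.0]]
print(attention(Q, K, V))
```

Every query position looks at every key position in a single step, which is what lets self-attention replace the sequential state passing of a recurrent model.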

NLP / Retrieval