Mark Boss

Co-Head of 3D & Image

Stability AI

Biography

Mark Boss is the Co-Head of 3D & Image at Stability AI. He previously worked at Unity Technologies and completed his PhD in the computer graphics group of Prof. Hendrik Lensch at the University of Tübingen. His research interests lie at the intersection of machine learning and computer graphics, with a focus on inferring physical properties (shape, material, illumination) from images.

If you are interested in a research collaboration, please drop me an email with your CV.

Education

PhD in Computer Science

2023

University of Tübingen

MSc in Computer Science

2018

University of Tübingen

BSc in Computer Science

2016

Osnabrück University of Applied Sciences

Recent & Upcoming Talks

Generative AI for VFX & Games

Generative AI can unlock several tasks and subtasks in professional media production. This talk discusses recent advances in the field.

Inverse Rendering for Games

Asset production in the game industry is time-consuming, and since “The Vanishing of Ethan Carter”, photogrammetry has gained traction. While the assets produced by photogrammetry achieve incredible detail, the illumination is baked into the texture maps. This makes the assets inflexible and limits their use in games and movies without manual post-processing. In this talk, I will present our recent work on decomposing an object into its shape, reflectance, and illumination. This highly ill-posed problem becomes even more challenging when the illumination is not a single light source under laboratory conditions but unconstrained environmental illumination. Decomposing an object under this ambiguous setup enables the automated creation of relightable 3D assets for AR/VR applications, enhanced shopping experiences, games, and movies from online images.

I will present our recent methods in the field of reflectance decomposition using Neural Fields. Our methods build a neural volumetric reflectance decomposition from unconstrained image collections. Contrary to most recent works that require images captured under the same illumination, our input images are taken under varying illuminations. This practical setup enables the decomposition of images gathered from online searches and the automated creation of relightable 3D assets. Our techniques handle complex geometries with non-Lambertian surfaces, and we also extract 3D meshes with material properties from the learned reflectance volumes, enabling their use in existing graphics engines. Our latest method also enables the decomposition of unposed image collections; most recent reconstruction methods require posed collections, yet common pose recovery methods fail under highly varying illuminations or locations.

Neural Reflectance Decomposition

In this talk, I will present our recent work on decomposing an object into its shape, reflectance, and illumination. This highly ill-posed problem becomes even more challenging when the illumination is not a single light source under laboratory conditions but unconstrained environmental illumination. Decomposing an object under this ambiguous setup enables the automated creation of relightable 3D assets for AR/VR applications, enhanced shopping experiences, games, and movies from online images.

I will present our recent methods in the field of reflectance decomposition using Neural Fields. Our methods build a neural volumetric reflectance decomposition from unconstrained image collections. Contrary to most recent works that require images captured under the same illumination, our input images are taken under varying illuminations. This practical setup enables the decomposition of images gathered from online searches and the automated creation of relightable 3D assets. Our techniques handle complex geometries with non-Lambertian surfaces, and we also extract 3D meshes with material properties from the learned reflectance volumes, enabling their use in existing graphics engines. Our latest method also enables the decomposition of unposed image collections; most recent reconstruction methods require posed collections, yet common pose recovery methods fail under highly varying illuminations or locations.
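For context, the quantities these talks aim to separate are coupled by the standard rendering equation (a textbook relation, not a formula from the talk itself): the observed outgoing radiance L_o entangles geometry (the surface normal n), reflectance (the BRDF f_r), and illumination (L_i):

```latex
L_o(\mathbf{x}, \omega_o) = \int_{\Omega} f_r(\mathbf{x}, \omega_i, \omega_o)\, L_i(\mathbf{x}, \omega_i)\, (\mathbf{n} \cdot \omega_i)\, \mathrm{d}\omega_i
```

Inverse rendering observes only L_o in the images and must recover the factors inside the integral, which is why the problem is ill-posed: many combinations of reflectance and illumination can explain the same pixels.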

Publications

ReSWD: ReSTIR’d, not shaken. Combining Reservoir Sampling and Sliced Wasserstein Distance for Variance Reduction.

Distribution matching is central to many vision and graphics tasks, but the widely used Wasserstein distance is too costly to compute for high-dimensional distributions. The Sliced Wasserstein Distance (SWD) offers a scalable alternative, yet its Monte Carlo estimator suffers from high variance, resulting in noisy gradients and slow convergence. We introduce Reservoir SWD (ReSWD), which integrates Weighted Reservoir Sampling into SWD to adaptively retain informative projection directions across optimization steps, yielding stable gradients while remaining unbiased. Experiments on synthetic benchmarks and real-world tasks such as color correction and diffusion guidance show that ReSWD consistently outperforms standard SWD and other variance reduction baselines.
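As a concrete anchor, here is a minimal NumPy sketch of the plain Monte Carlo SWD estimator that ReSWD improves on; the random projection directions are the source of the variance the paper targets. This is an illustrative baseline, not code from the ReSWD paper, and the function name and signature are made up for this sketch.

```python
import numpy as np

def sliced_wasserstein_sq(x, y, n_projections=128, seed=None):
    """Monte Carlo estimate of the squared Sliced Wasserstein-2 distance
    between two point sets of equal size (n, d).

    Each random unit direction gives an exact 1D optimal transport cost
    (sort and compare); averaging over directions is the noisy part.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    # Sample random directions uniformly on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sets onto every direction: shape (n, n_projections).
    proj_x = np.sort(x @ theta.T, axis=0)
    proj_y = np.sort(y @ theta.T, axis=0)
    # 1D W2^2 per direction is the mean squared difference of sorted
    # projections; the final estimate averages over directions.
    return float(np.mean((proj_x - proj_y) ** 2))
```

Each direction yields an exact 1D Wasserstein cost via sorting; averaging a few of them gives a cheap but noisy estimate, and ReSWD's reservoir of informative directions targets exactly that noise.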

SViM3D: Stable Video Material Diffusion for Single Image 3D Generation

We present Stable Video Materials 3D (SViM3D), a framework to predict multi-view consistent physically based rendering (PBR) materials, given a single image. Recently, video diffusion models have been successfully used to reconstruct 3D objects from a single image efficiently. However, reflectance is still represented by simple material models or needs to be estimated in additional pipeline steps to enable relighting and controlled appearance edits. We extend a latent video diffusion model to output spatially-varying PBR parameters and surface normals jointly with each generated RGB view based on explicit camera control. This unique setup allows for direct relighting in a 2.5D setting, and for generating a 3D asset using our model as neural prior. We introduce various mechanisms to this pipeline that improve quality in this ill-posed setting. We show state-of-the-art relighting and novel view synthesis performance on multiple object-centric datasets. Our method generalizes to diverse image inputs, enabling the generation of relightable 3D assets useful in AR/VR, movies, games and other visual media.

MARBLE: Material Recomposition and Blending in CLIP-Space

Editing materials of objects in images based on exemplar images is an active area of research in computer vision and graphics. We propose MARBLE, a method for performing material blending and recomposing fine-grained material properties by finding material embeddings in CLIP-space and using that to control pre-trained text-to-image models. We improve exemplar-based material editing by finding a block in the denoising UNet responsible for material attribution. Given two material exemplar-images, we find directions in the CLIP-space for blending the materials. Further, we can achieve parametric control over fine-grained material attributes such as roughness, metallic, transparency, and glow using a shallow network to predict the direction for the desired material attribute change. We perform qualitative and quantitative analysis to demonstrate the efficacy of our proposed method. We also present the ability of our method to perform multiple edits in a single forward pass and applicability to painting.
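The CLIP-space operations described above, a direction between two exemplar embeddings and a blend along it, reduce to simple vector arithmetic. The sketch below is a toy under the assumption that emb_a and emb_b are L2-normalized CLIP image embeddings of the two material exemplars; the function names are hypothetical and not from MARBLE's code.

```python
import numpy as np

def normalize(v):
    # Project an embedding onto the unit sphere, as CLIP features
    # are commonly normalized before comparison.
    return v / np.linalg.norm(v)

def material_direction(emb_a, emb_b):
    # Edit direction pointing from material A toward material B
    # in embedding space (hypothetical helper for illustration).
    return normalize(emb_b - emb_a)

def blend_materials(emb_a, emb_b, alpha):
    # Interpolate between two material embeddings (alpha in [0, 1])
    # and renormalize, mimicking a blend in CLIP-space.
    return normalize((1.0 - alpha) * emb_a + alpha * emb_b)
```

In the actual method such directions condition a pre-trained text-to-image model through a specific UNet block; here they only illustrate the embedding-space geometry.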

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

We present Stable Virtual Camera (Seva), a generalist diffusion model that creates novel views of a scene, given any number of input views and target cameras. Existing works struggle to generate either large viewpoint changes or temporally smooth samples while relying on specific task configurations. Our approach overcomes these limitations through a simple model design, an optimized training recipe, and a flexible sampling strategy that generalize across view synthesis tasks at test time. As a result, our samples maintain high consistency without requiring additional 3D representation-based distillation, thus streamlining view synthesis in the wild. Furthermore, we show that our method can generate high-quality videos lasting up to half a minute with seamless loop closure. Extensive benchmarking demonstrates that Seva outperforms existing methods across different datasets and settings.

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

We study the problem of single-image 3D object reconstruction. Recent works have diverged into two directions: regression-based modeling and generative modeling. Regression methods efficiently infer visible surfaces, but struggle with occluded regions. Generative methods handle uncertain regions better by modeling distributions, but are computationally expensive and the generation is often misaligned with visible surfaces. In this paper, we present SPAR3D, a novel two-stage approach aiming to take the best of both directions. The first stage of SPAR3D generates sparse 3D point clouds using a lightweight point diffusion model, which has a fast sampling speed. The second stage uses both the sampled point cloud and the input image to create highly detailed meshes. Our two-stage design enables a probabilistic modeling of the ill-posed single-image 3D task, while maintaining high computational efficiency and great output fidelity. Using point clouds as an intermediate representation further allows for interactive user edits. Evaluated on diverse datasets, SPAR3D demonstrates superior performance over previous state-of-the-art methods, at an inference speed of 0.7 seconds.

SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement

We present SF3D, a novel method for rapid and high-quality textured mesh reconstruction from a single image. Utilizing the Large Reconstruction Model (LRM) as its foundation, SF3D achieves fast generation times with detailed, textured meshes in just 0.5 seconds. Unlike traditional approaches, SF3D is explicitly trained for mesh generation, incorporating a fast UV unwrapping technique that enables swift texture generation rather than relying on vertex colors. The method also learns material parameters and normal maps to enhance the visual quality of the reconstructed models. Furthermore, SF3D integrates a delighting step to effectively remove low-frequency illumination effects, ensuring that the reconstructed meshes can be easily used in novel illumination conditions.

Experience

Co-Head of 3D & Image

Stability AI

Leading research in generative AI for 3D.

Research Scientist

Stability AI

Research on object acquisition, the use of deep priors to reduce ambiguities, and generative AI.

Senior Research Scientist

Unity

Research on object acquisition and generative AI.

Ph.D. Student

University of Tübingen

Research on deep learning based material acquisition.

Student Researcher

Google

Research on novel techniques for material, geometry, and illumination disentanglement.

Research Intern

NVIDIA

Research on casual shape and material acquisition.

Android Developer

zahlz

Development of an Android Application for a mobile payment system.

Projects
Postergeist featured image

Postergeist

Academic poster generator that converts Markdown files to beautiful HTML posters with live preview, drag-and-drop editing, and PDF export. Also supports a Claude Code Skill to turn …

Outline.md featured image

Outline.md

A cross-platform, markdown-based hierarchical outline editor built with Flutter. Create structured documents using familiar markdown headings, export to LaTeX, and organize your …

GifDrop featured image

GifDrop

GifDrop is a desktop app that converts video to GIF and optimizes existing GIFs. Drag and drop your files, tweak quality and size, and export—no command line required. It bundles …

Citegeist featured image

Citegeist

Checks, standardizes, and upgrades .bib files automatically.

Geotex

GeoTex is a minimal Python wrapper that exposes the UV atlas generation functions from Geogram. Given a triangulated 3D mesh, it produces per-corner UV coordinates packed into a …

Recent Posts

NeRF at NeurIPS 2022

Inspired by Frank Dellaert and his excellent series on the original NeRF Explosion and the following ICCV/CVPR conference gatherings, I decided to look into creating a NeurIPS 22 rundown myself.

The papers below are all the papers I could gather by browsing through the extensive list of accepted NeurIPS papers. I mainly collected papers whose titles seemed relevant and did a brief scan of each paper, or only the abstract if the paper wasn't published at the time of writing. If I have mischaracterized or missed any paper, please send me a DM on Twitter @markb_boss or reach out via email.