How Meshlets Enable Real-Time Geometry Simplification and Refinement Without Full BVH Rebuilds?

dodo59209
Posts: 1
Joined: Mon Jan 18, 2021 3:30 am

Hello,

I would like to better understand how meshlets work in modern rendering pipelines: the key execution stages on both CPU and GPU, and how they enable real-time geometry simplification and refinement for very dense meshes without requiring a full BVH rebuild, even when a streaming system is involved.
Specific Points I’m Looking For Details On

1. Meshlet Construction and Layout
How are meshlets generated from dense meshes?
What constraints typically drive their size and structure (vertex/triangle count, spatial locality, GPU cache efficiency)?
Are there recommended strategies to prepare meshlets for efficient LOD transitions and streaming?

2. Execution Stages in the GPU Pipeline
At which stages do meshlets participate (culling, amplification shader, mesh shader, rasterization)?
How are visible meshlets selected each frame?
How are frustum culling, backface culling, and occlusion culling applied at the meshlet level?
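For question 2, my current understanding is that each meshlet carries small baked culling data (a bounding sphere plus a normal cone) that an amplification/task shader tests before launching mesh shader workgroups. The struct layout below is an assumption; the cone test follows the meshoptimizer-style formulation as I understand it:

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };  // plane: xyz = inward normal, w = offset

// Assumed per-meshlet culling record baked at build time.
struct MeshletBounds {
    Vec3  center;   float radius;       // bounding sphere
    Vec3  coneAxis; float coneCutoff;   // normal cone for backface rejection
};

static float dot3(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Sphere vs. plane test against six inward-facing frustum planes.
bool frustumVisible(const MeshletBounds& m, const Vec4 planes[6]) {
    for (int i = 0; i < 6; ++i) {
        float d = planes[i].x * m.center.x + planes[i].y * m.center.y +
                  planes[i].z * m.center.z + planes[i].w;
        if (d < -m.radius) return false;  // fully outside this plane
    }
    return true;  // potentially visible
}

// Cone backface test: the meshlet is culled only when every triangle in it
// is guaranteed to face away from the camera.
bool coneBackfaceVisible(const MeshletBounds& m, Vec3 cameraPos) {
    Vec3 view = { m.center.x - cameraPos.x,
                  m.center.y - cameraPos.y,
                  m.center.z - cameraPos.z };
    float viewLen = std::sqrt(dot3(view, view));
    // Culled when dot(view, axis) >= cutoff * |view| + radius.
    return dot3(view, m.coneAxis) < m.coneCutoff * viewLen + m.radius;
}
```

What I can't picture yet is how occlusion culling (e.g. against a hierarchical depth buffer from the previous frame) slots in next to these two tests.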

3. Real-Time Geometry Simplification and Refinement
How do meshlets allow increasing or decreasing geometric detail without modifying the global mesh?
Is this typically achieved through:
multiple meshlet LODs,
hierarchical refinement (coarse-to-fine),
GPU-driven selection based on screen-space error?
How fine-grained can refinement be at the meshlet level?
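On question 3, the part I think I understand is the selection metric itself: project each LOD's object-space simplification error to pixels and take the coarsest LOD under a pixel threshold. Here is that metric as I would sketch it (the `MeshletLOD` record and the fine-to-coarse ordering are my assumptions); what I don't understand is how this stays crack-free across neighboring meshlets:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed per-meshlet LOD record: 'error' is the object-space geometric
// deviation introduced by this simplification level.
struct MeshletLOD {
    float    error;     // object-space error (0 for full detail)
    uint32_t lodLevel;  // 0 = finest, higher = coarser
};

// Standard perspective projection of an object-space error to pixels:
// screenError = error * screenHeight / (2 * distance * tan(fovY / 2)).
float screenSpaceError(float objectError, float distance,
                       float fovY, float screenHeight) {
    return objectError * screenHeight /
           (2.0f * distance * std::tan(fovY * 0.5f));
}

// Pick the coarsest LOD whose projected error stays under the threshold.
// Assumes 'lods' is ordered fine-to-coarse (error increasing).
uint32_t selectLOD(const MeshletLOD* lods, int count, float distance,
                   float fovY, float screenHeight, float pixelThreshold) {
    uint32_t best = lods[0].lodLevel;
    for (int i = 0; i < count; ++i) {
        if (screenSpaceError(lods[i].error, distance, fovY, screenHeight)
                <= pixelThreshold)
            best = lods[i].lodLevel;  // coarser is still acceptable
        else
            break;
    }
    return best;
}
```

If Nanite-like systems instead evaluate this per cluster inside a DAG of merged/simplified groups, an outline of that traversal would help a lot.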

4. Interaction with BVH and Spatial Acceleration
How is a full BVH rebuild avoided when only meshlet LODs change?
Are common approaches based on:
per-meshlet BVHs,
a stable top-level BVH with adjustable or conservative bounds,
partial or incremental BVH updates?
How does this integrate with ray tracing pipelines or software-based culling?
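For question 4, my guess is that the BVH topology is kept fixed (built once over meshlet groups, with leaf bounds conservative over all LOD variants) and only the boxes are refitted bottom-up when LODs change, which is much cheaper than a rebuild. A sketch of that refit pass, assuming a flat node array where children are stored after their parent:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct AABB {
    float min[3], max[3];
};

AABB merge(const AABB& a, const AABB& b) {
    AABB r;
    for (int i = 0; i < 3; ++i) {
        r.min[i] = std::min(a.min[i], b.min[i]);
        r.max[i] = std::max(a.max[i], b.max[i]);
    }
    return r;
}

// Fixed-topology BVH: nodes are ordered so children always follow their
// parent, which lets one reverse pass refit all bounds bottom-up.
struct BVHNode {
    AABB bounds{};
    int left = -1, right = -1;  // -1 => leaf
    int meshletIndex = -1;      // valid for leaves
};

// Refit bounds only; the tree structure is untouched, so no full rebuild.
// 'leafBounds' holds each meshlet's current (or union-over-LODs) box.
void refit(std::vector<BVHNode>& nodes, const std::vector<AABB>& leafBounds) {
    for (int i = (int)nodes.size() - 1; i >= 0; --i) {
        BVHNode& n = nodes[i];
        if (n.left < 0)
            n.bounds = leafBounds[n.meshletIndex];
        else
            n.bounds = merge(nodes[n.left].bounds, nodes[n.right].bounds);
    }
}
```

What I'd like confirmed is whether ray tracing pipelines do essentially this via BLAS refit/update while rebuilding only occasionally when quality degrades, or whether per-meshlet BVHs are more common.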

5. Streaming and Memory Management
How are meshlets streamed (GPU residency management, feedback systems, virtual geometry)?
What happens when high-detail meshlets are not yet available?
How are GPU stalls avoided during meshlet replacement or streaming?
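For question 5, the behavior I would expect (and would like confirmed) is: if the wanted LOD is not resident, draw the finest resident coarser LOD this frame and enqueue an asynchronous load instead of stalling, with the coarsest LOD pinned in memory so there is always something to draw. A sketch of that resolve step, with an assumed bitmask residency table:

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Assumed residency table: which LOD levels of each meshlet group are in
// GPU memory. The coarsest LOD is pinned resident by construction.
struct ResidencySet {
    std::vector<uint32_t> residentMask;  // bit i set => LOD i resident
};

// Decide what to draw this frame: the wanted LOD if resident, otherwise the
// finest resident coarser LOD, queuing an async load for the missing one.
uint32_t resolveLOD(const ResidencySet& r, uint32_t group, uint32_t wanted,
                    uint32_t coarsestLOD,
                    std::queue<std::pair<uint32_t, uint32_t>>& loadRequests) {
    uint32_t mask = r.residentMask[group];
    if (mask & (1u << wanted)) return wanted;   // hit: draw as requested
    loadRequests.push({group, wanted});         // miss: request, don't wait
    for (uint32_t lod = wanted + 1; lod <= coarsestLOD; ++lod)
        if (mask & (1u << lod)) return lod;     // fall back to coarser data
    return coarsestLOD;  // pinned resident, so always drawable
}
```

What I don't know is how the feedback loop is closed in practice: whether the GPU writes the miss list itself (virtual-geometry style readback) or the CPU predicts demand from the camera.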

6. Trade-offs and Limitations
What are the main costs (CPU, GPU, memory) of a meshlet-based system?
In which scenarios are meshlets not a good fit?
How does this approach compare to classic mesh LOD systems with BVH rebuilds?

Final Goal
The goal is to understand how a meshlet-based, GPU-driven, adaptive rendering system can handle extremely dense scenes while maintaining:
continuous simplification and refinement,
a stable or partially updated BVH,
real-time performance.

Any insights, implementation details, or experience with modern engines or APIs (DX12 Ultimate, Vulkan, mesh shaders, Nanite-like systems, etc.) would be greatly appreciated.