| Method | Rank | TT PSNR | TT SSIM | TT LPIPS | TT MB | TT kG | TT b/G | M360 PSNR | M360 SSIM | M360 LPIPS | M360 MB | M360 kG | M360 b/G | DB PSNR | DB SSIM | DB LPIPS | DB MB | DB kG | DB b/G | SN PSNR | SN SSIM | SN LPIPS | SN MB | SN kG | SN b/G | Cat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HAC-lowrate | 4.3 | 24.04 | 0.846 | 0.187 | 8.5 | 848 | 80 | 27.53 | 0.807 | 0.238 | 16.0 | 2,166 | 59 | 29.98 | 0.902 | 0.269 | 4.6 | 568 | 64 | 33.24 | 0.967 | 0.037 | 1.2 | 99 | 98 | c |
| HAC-highrate | 4.3 | 24.40 | 0.853 | 0.177 | 11.8 | 1,025 | 92 | 27.77 | 0.811 | 0.230 | 22.9 | 2,264 | 81 | 30.34 | 0.906 | 0.258 | 6.7 | 624 | 85 | 33.71 | 0.968 | 0.034 | 2.0 | 143 | 109 | c |
| GaussianPro | | 24.09 | 0.862 | 0.185 | 357.4 | 1,441 | 1984 | 27.43 | 0.813 | 0.219 | 844.1 | 3,403 | 1984 | 29.79 | 0.913 | 0.222 | 640.5 | 2,582 | 1984 | | | | | | | d |
| LightGaussian | 15.2 | 23.11 | 0.817 | 0.231 | 22.0 | | | 27.28 | 0.805 | 0.243 | 42.0 | | | | | | | | | 32.72 | 0.965 | 0.037 | 7.8 | | | c |
| Mini-Splatting | | 23.18 | 0.835 | 0.202 | | 200 | | 27.34 | 0.822 | 0.217 | | 490 | | 29.98 | 0.908 | 0.253 | | 350 | | | | | | | | d |
| Mini-Splatting-D | | 23.23 | 0.853 | 0.140 | | 4,280 | | 27.51 | 0.831 | 0.176 | | 4,690 | | 29.88 | 0.906 | 0.211 | | 4,630 | | | | | | | | d |
| EAGLES-Small | 16.2 | 23.10 | 0.82 | 0.22 | 19.0 | 650 | 234 | 26.94 | 0.80 | 0.25 | 47.0 | 1,330 | 283 | 29.92 | 0.90 | 0.25 | 33.0 | 1,190 | 222 | | | | | | | c |
| EAGLES | 13.5 | 23.37 | 0.84 | 0.20 | 29.0 | | | 27.23 | 0.81 | 0.24 | 54.0 | | | 29.86 | 0.91 | 0.25 | 52.0 | | | | | | | | | c |
| gsplat-1.00M | 4.8 | 24.03 | 0.857 | 0.163 | 16.1 | 1,000 | 129 | 27.29 | 0.811 | 0.229 | 16.0 | 1,000 | 128 | | | | | | | | | | | | | c |
| Color-cued GS | | 23.18 | 0.830 | 0.198 | 42.0 | 370 | 908 | 27.07 | 0.797 | 0.249 | 73.0 | 646 | 904 | 29.71 | 0.902 | 0.255 | 72.0 | 644 | 894 | | | | | | | d |
| Compact3DGS+PP | 12.8 | 23.32 | 0.831 | 0.202 | 20.9 | | | 27.03 | 0.797 | 0.247 | 29.1 | | | 29.73 | 0.900 | 0.258 | 23.8 | | | 32.88 | 0.968 | 0.034 | 2.8 | | | c |
| Compact3DGS | 14.5 | 23.32 | 0.831 | 0.201 | 39.4 | 836 | 377 | 27.08 | 0.798 | 0.247 | 48.8 | 1,388 | 281 | 29.79 | 0.901 | 0.258 | 43.2 | 1,058 | 326 | 33.33 | 0.968 | 0.034 | 5.8 | | | c |
| AtomGS | | 23.70 | 0.849 | 0.166 | 367.2 | 1,480 | 1984 | 27.38 | 0.816 | 0.211 | 779.0 | 3,140 | 1984 | | | | | | | | | | | | | d |
| CompGS | 8.5 | 23.70 | 0.837 | 0.208 | 10.1 | 235 | 342 | 27.26 | 0.803 | 0.239 | 17.3 | 493 | 281 | 29.69 | 0.901 | 0.279 | 9.2 | 229 | 320 | | | | | | | c |
| Scaffold-GS | 13.7 | 23.96 | 0.853 | 0.177 | 87.0 | | | 27.50 | 0.806 | 0.252 | 156.0 | | | 30.21 | 0.906 | 0.254 | 66.0 | | | | | | | | | c |
| SOG w/o SH | 10.0 | 23.15 | 0.828 | 0.198 | 9.3 | 1,207 | 62 | 26.56 | 0.791 | 0.241 | 16.7 | 2,149 | 62 | 29.12 | 0.892 | 0.270 | 5.7 | 800 | 57 | 31.37 | 0.959 | 0.043 | 2.0 | 175 | 89 | c |
| SOG | 12.4 | 23.56 | 0.837 | 0.186 | 22.8 | 1,242 | 147 | 27.08 | 0.799 | 0.230 | 40.3 | 2,176 | 148 | 29.26 | 0.894 | 0.268 | 17.7 | 890 | 159 | 33.23 | 0.966 | 0.034 | 4.1 | 157 | 210 | c |
| Compact3D 16K | 8.5 | 23.39 | 0.836 | 0.200 | 12.0 | | | 27.03 | 0.804 | 0.243 | 18.0 | | | 29.90 | 0.906 | 0.252 | 12.0 | | | | | | | | | c |
| Compact3D 32K | 8.1 | 23.44 | 0.838 | 0.198 | 13.0 | 520 | 200 | 27.12 | 0.806 | 0.240 | 19.0 | 845 | 180 | 29.90 | 0.907 | 0.251 | 13.0 | 554 | 188 | | | | | | | c |
| Compressed3D | 12.0 | 23.32 | 0.832 | 0.194 | 17.3 | | | 26.98 | 0.801 | 0.238 | 28.8 | | | 29.38 | 0.898 | 0.253 | 25.3 | | | 32.94 | 0.967 | 0.033 | 3.7 | | | c |
| Reduced3DGS | 9.5 | 23.57 | 0.840 | 0.188 | 14.0 | 680 | 165 | 27.10 | 0.809 | 0.226 | 29.0 | 1,460 | 159 | 29.63 | 0.902 | 0.249 | 18.0 | 1,010 | 143 | | | | | | | c |
| Octree-GS | | 24.68 | 0.866 | 0.153 | | 443 | | 28.05 | 0.819 | 0.217 | | 657 | | 30.49 | 0.912 | 0.241 | | 112 | | | | | | | | d |
| Taming3DGS | | 23.89 | 0.835 | 0.207 | | 290 | | 27.29 | 0.799 | 0.253 | | 630 | | 27.79 | 0.822 | 0.263 | | 270 | | | | | | | | d |
| Taming3DGS (Big) | | 24.04 | 0.851 | 0.170 | | 1,840 | | 27.79 | 0.822 | 0.205 | | 3,310 | | 30.14 | 0.907 | 0.235 | | 2,810 | | | | | | | | d |
| RDO-Gaussian | 9.1 | 23.34 | 0.835 | 0.195 | 12.0 | 907 | 106 | 27.05 | 0.802 | 0.239 | 23.5 | 1,859 | 101 | 29.63 | 0.902 | 0.252 | 18.0 | 1,474 | 98 | 33.12 | 0.967 | 0.035 | 2.3 | 132 | 139 | c |
| IGS low | 5.7 | 23.70 | 0.836 | 0.227 | 8.9 | 1,278 | 55 | 27.34 | 0.811 | 0.255 | 13.4 | 2,092 | 51 | 30.63 | 0.904 | 0.293 | 6.6 | 1,536 | 35 | 33.36 | 0.971 | 0.036 | 1.9 | 157 | 99 | c |
| IGS high | 6.4 | 24.05 | 0.849 | 0.211 | 13.1 | 1,278 | 82 | 27.62 | 0.820 | 0.245 | 27.0 | 2,092 | 103 | 32.33 | 0.924 | 0.253 | 8.1 | 1,536 | 42 | 34.18 | 0.975 | 0.032 | 2.9 | 157 | 145 | c |
| MesonGS c3 | 11.5 | 23.29 | 0.835 | 0.197 | 17.4 | 1,162 | 119 | 26.99 | 0.797 | 0.246 | 25.9 | 1,870 | 111 | 29.48 | 0.903 | 0.252 | 29.0 | 2,022 | 115 | 32.96 | 0.968 | 0.033 | 3.5 | 207 | 135 | c |
| MesonGS c1 | 12.2 | 23.31 | 0.835 | 0.196 | 18.5 | 1,239 | 119 | 26.99 | 0.796 | 0.247 | 28.5 | 2,082 | 109 | 29.50 | 0.903 | 0.251 | 31.1 | 2,166 | 115 | 32.94 | 0.968 | 0.033 | 3.9 | 235 | 131 | c |
| 3DGS-30K | 14.5 | 23.14 | 0.841 | 0.183 | 411.0 | 1,783 | 1843 | 27.21 | 0.815 | 0.214 | 734.0 | 3,362 | 1746 | 29.41 | 0.903 | 0.243 | 676.0 | 2,975 | 1817 | 33.32 | | | | | | |

Datasets: TT = Tanks and Temples, M360 = Mip-NeRF 360, DB = Deep Blending, SN = Synthetic NeRF. Columns: MB = model size in megabytes, kG = thousands of Gaussians, b/G = bits per Gaussian; Cat: c = compression method, d = compaction method. Empty cells indicate values not reported.
The best methods in each category are highlighted with gold, silver, and bronze colors.
The ranks represent the average rankings of the methods across all available and selected datasets.
The slider above the table allows you to toggle between ranking methods (compression and compaction), as well as between pre-selected sets of displayed attributes, datasets, and methods.
Selections can be further refined using the checkboxes below the table.
According to our definitions, the compression rank seeks the smallest file size for the highest possible quality,
while the compaction rank assesses the minimal number of Gaussians in relation to image quality.
While all selected datasets contribute to the compression rank, the compaction rank is limited to the Tanks and Temples, Mip-NeRF 360, and Deep Blending datasets.
The quality metrics PSNR, SSIM, and LPIPS are equally weighted with the model size for the compression rank,
and with the number of Gaussians for the compaction rank.
When deviating from the pre-selected attributes, the ranking calculation dynamically adjusts to the remaining metrics.
The formula for calculating the dataset ranks based on the current selection is provided below.
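As a rough illustration of this averaging (a simplified sketch, not the site's exact formula), the following Python assigns each method a per-metric rank with equal weights, treating PSNR and SSIM as higher-is-better and everything else (LPIPS, size, Gaussian count) as lower-is-better:

```python
def dataset_rank(methods, higher_better=("PSNR", "SSIM")):
    """Average the per-metric ranks of each method (rank 1 = best).

    `methods` maps a method name to its metric values; all selected
    metrics are weighted equally. Metrics not listed in `higher_better`
    (e.g. LPIPS, size, Gaussian count) are treated as lower-is-better.
    """
    names = list(methods)
    metrics = {m for vals in methods.values() for m in vals}
    ranks = {n: [] for n in names}
    for metric in sorted(metrics):
        scored = [n for n in names if metric in methods[n]]
        reverse = metric in higher_better  # sort best-first
        order = sorted(scored, key=lambda n: methods[n][metric], reverse=reverse)
        for position, name in enumerate(order, start=1):
            ranks[name].append(position)
    return {n: sum(r) / len(r) for n, r in ranks.items() if r}

# toy example with two methods and three equally weighted metrics
table = {
    "A": {"PSNR": 24.0, "LPIPS": 0.19, "Size": 8.5},
    "B": {"PSNR": 23.3, "LPIPS": 0.20, "Size": 12.0},
}
```

In the toy table, method A is best on every metric, so its average rank is 1.0 and B's is 2.0.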
The paper proposes a Hash-grid Assisted Context (HAC) framework for compressing 3D Gaussian Splatting (3DGS) models by leveraging the mutual information between the attributes of unorganized 3D Gaussians (anchors) and hash grid features. Using Scaffold-GS as a base model, HAC queries the hash grid at each anchor location to predict the distributions of the anchor attributes for efficient entropy coding. The framework introduces an Adaptive Quantization Module (AQM) to dynamically adjust quantization step sizes. Furthermore, the method employs adaptive offset masking with learnable masks to eliminate invalid Gaussians and anchors: it adopts the pruning strategy introduced by Compact3DGS and additionally removes an anchor once all of its attached offsets are pruned.
This approach leverages 3D Gaussian Splatting as Markov Chain Monte Carlo (3DGS-MCMC), interpreting the training process of positioning and optimizing Gaussians as a sampling procedure rather than the minimization of a predefined loss function. Additionally, it incorporates compression techniques derived from the SOG paper, which organizes the parameters of 3DGS in a 2D grid, capitalizing on perceptual redundancies found in natural scenes and thereby significantly reducing storage requirements. Further compression is achieved by applying methods from the blog post "Making Gaussian Splats more smaller", which reduces the size of Gaussian splats by clustering spherical harmonics into discrete elements and storing them as FP16 values. This technique is implemented in gsplat, an open-source library for CUDA-accelerated differentiable rasterization of 3D Gaussians with Python bindings.
This method introduces a hybrid representation for splatting-based radiance fields, in which Gaussian primitives are separated into an explicit point cloud and implicit attribute features. The attribute features are encoded using a multi-resolution, multi-level tri-plane architecture integrated with a residual-based rendering pipeline. It employs a level-based progressive training scheme for the joint optimization of point clouds and tri-planes, starting with coarse attributes and refining them with higher-level details. Spatial regularization and a bootstrapping scheme are applied to enhance the consistency and stability of the Gaussian attributes during training.
This paper introduces RDO-Gaussian, an end-to-end Rate-Distortion Optimized 3D Gaussian representation. The authors achieve flexible, continuous rate control by formulating 3D Gaussian representation learning as a joint optimization of rate and distortion. Rate-distortion optimization is realized through dynamic pruning and entropy-constrained vector quantization (ECVQ). Gaussian pruning learns a mask to eliminate redundant Gaussians, while adaptive SH pruning assigns varying SH degrees to each Gaussian based on material and illumination needs. The covariance and color attributes are discretized through ECVQ, which performs vector quantization.
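The joint objective has the classic Lagrangian form L = D + λR. A toy sketch (with made-up distortion/rate pairs, not values from the paper) of how λ steers the chosen operating point along a rate-distortion curve:

```python
def rd_cost(distortion, rate, lam):
    """Lagrangian rate-distortion cost L = D + lambda * R."""
    return distortion + lam * rate

# hypothetical operating points as (distortion, rate in bits) --
# lower distortion costs more bits
points = [(1.00, 10), (0.50, 20), (0.30, 40), (0.25, 80)]

def best_point(points, lam):
    """Pick the operating point minimising the R-D cost for a given lambda."""
    return min(points, key=lambda p: rd_cost(p[0], p[1], lam))
```

With λ = 0 the rate is free and the lowest-distortion point wins; as λ grows, cheaper (higher-distortion, lower-rate) points are preferred.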
This approach addresses three main issues contributing to large storage sizes in 3D Gaussian Splatting (3DGS). To reduce the number of 3D Gaussian primitives, the authors introduce a scale- and resolution-aware redundant primitive removal method. This extends opacity-based pruning by incorporating a redundancy score to identify regions with many low-impact primitives. To mitigate storage size due to spherical harmonic coefficients, they propose adaptive adjustment of spherical harmonic (SH) bands. This involves evaluating color consistency across views and reducing higher-order SH bands when view-dependent effects are minimal. Additionally, recognizing the limited need for high dynamic range and precision for most primitive attributes, they develop a codebook using K-means clustering and apply 16-bit half-float quantization to the remaining uncompressed floating point values.
Compressing 3D data is challenging, but many effective solutions exist for compressing 2D data (such as images). The authors propose a new method to organize 3DGS parameters into a 2D grid, drastically reducing storage requirements without compromising visual quality. This organization exploits perceptual redundancies in natural scenes. They introduce a highly parallel sorting algorithm, PLAS, which arranges Gaussian parameters into a 2D grid, maintaining local neighborhood structure and ensuring smoothness. This solution is particularly innovative because no existing method efficiently handles a 2D grid with millions of points. During training, a smoothness loss is applied to enforce local smoothness in the 2D grid, enhancing the compressibility of the data. The key insight is that smoothness needs to be enforced during training to enable efficient compression.
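The smoothness term can be illustrated on a toy 2-D grid, penalizing squared differences between horizontally and vertically adjacent cells (a sketch; SOG's actual loss operates on the sorted 2D grids of Gaussian parameters):

```python
def smoothness_loss(grid):
    """Sum of squared differences between each cell and its right/down
    neighbours -- a toy stand-in for the local-smoothness term applied
    to the 2D grid during training to improve compressibility."""
    h, w = len(grid), len(grid[0])
    loss = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                loss += (grid[y][x] - grid[y][x + 1]) ** 2
            if y + 1 < h:
                loss += (grid[y][x] - grid[y + 1][x]) ** 2
    return loss
```

A constant grid incurs zero loss, while a checkerboard (the least compressible layout) is maximally penalized.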
MesonGS employs universal Gaussian pruning by evaluating the importance of Gaussians through forward propagation, considering both view-dependent and view-independent features. It transforms rotation quaternions into Euler angles to reduce storage requirements and applies the region adaptive hierarchical transform (RAHT) to reduce entropy in key attributes. Block quantization is performed on attribute channels by dividing them into multiple blocks and quantizing each block individually, with vector quantization used for compressing less important attributes. Geometry is compressed using an octree, and all elements are packed with the LZ77 codec. A finetuning scheme is applied post-training to restore quality.
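The quaternion-to-Euler conversion stores three angles instead of four quaternion components. A sketch using the common ZYX convention (MesonGS's exact convention is not assumed here):

```python
import math

def quat_to_euler(w, x, y, z):
    """Convert a unit rotation quaternion to (roll, pitch, yaw) Euler
    angles in radians -- 3 values instead of 4, cutting per-Gaussian
    rotation storage by a quarter (ZYX convention, for illustration)."""
    roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
    # clamp guards against tiny numerical overshoot outside [-1, 1]
    pitch = math.asin(max(-1.0, min(1.0, 2 * (w * y - z * x))))
    yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
    return roll, pitch, yaw
```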
The authors propose a compressed 3D Gaussian splat representation consisting of three main steps: 1. sensitivity-aware clustering, where scene parameters are measured according to their contribution to the training images and encoded into compact codebooks via sensitivity-aware vector quantization; 2. quantization-aware fine-tuning, which recovers lost information by fine-tuning parameters at reduced bit-rates using quantization-aware training; and 3. entropy encoding, which exploits spatial coherence through entropy and run-length encoding by linearizing 3D Gaussians along a space-filling curve. Furthermore, a renderer for the compressed scenes utilizing GPU-based sorting and rasterization is proposed, enabling real-time novel view synthesis on low-end devices.
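Linearizing along a space-filling curve keeps spatially close Gaussians adjacent in the stream, which is what makes run-length encoding effective. A 2-D Morton-order sketch (the method itself linearizes 3D Gaussians; the 2-D version is a simplification):

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of quantised (x, y) coordinates into a single
    Z-order (Morton) index; sorting by this index linearises points
    along a space-filling curve."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i) | ((y >> i) & 1) << (2 * i + 1)
    return code

def run_length_encode(seq):
    """Basic run-length encoding of the linearised attribute stream."""
    out = []
    for v in seq:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return [(v, n) for v, n in out]
```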
This approach introduces a Gaussian volume mask to prune non-essential Gaussians and a compact attribute representation for both view-dependent color and geometric attributes. The volume-based masking strategy combines opacity and scale to selectively remove redundant Gaussians. For color attribute compression, spatial redundancy is exploited by incorporating a grid-based (Instant-NGP) neural field, allowing efficient representation of view-dependent colors without storing attributes per Gaussian. Given the limited variation in scale and rotation, geometric attribute compression employs a compact codebook-based representation to identify and reuse similar geometries across the scene. Additionally, the authors propose quantization and entropy-coding as post-processing steps for further compression.
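The volume-based masking idea can be sketched as scoring each Gaussian by its opacity times the volume implied by its scales. Note that the method learns the mask end-to-end during training; the fixed threshold here is a simplifying assumption:

```python
def volume_mask(opacity, scales, threshold=1e-4):
    """Toy volume-based mask: keep a Gaussian only if opacity times the
    volume implied by its (sx, sy, sz) scales exceeds a threshold, so
    transparent AND tiny Gaussians are both pruned."""
    sx, sy, sz = scales
    return opacity * sx * sy * sz > threshold
```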
The authors of this approach observed that in 3DGS, the color and rotation attributes account for over 80% of memory usage; thus, they propose compressing these attributes via a latent quantization framework. Additionally, they quantize the opacity coefficients of the Gaussians, improving optimization and resulting in fewer floaters or visual artifacts in novel view reconstructions. To reduce the number of redundant Gaussians resulting from frequent densification (via cloning and splitting), the approach employs a pruning stage to identify and remove Gaussians with minimal influence on the full reconstruction. For this, an influence metric is introduced, which considers both opacity and transmittance.
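The opacity-and-transmittance influence of a Gaussian along a ray follows the standard alpha-blending weights; a sketch (the paper's actual metric aggregates such contributions over training views):

```python
def blending_weights(alphas):
    """Per-Gaussian contribution along one ray: w_i = alpha_i * T_i,
    where the transmittance T_i is the product of (1 - alpha_j) over
    all Gaussians j in front of i. Gaussians whose accumulated weight
    stays near zero across rays are candidates for pruning."""
    weights, transmittance = [], 1.0
    for a in alphas:  # alphas ordered front to back
        weights.append(a * transmittance)
        transmittance *= 1.0 - a
    return weights
```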
Scaffold-GS introduces anchor points that leverage scene structure to guide the distribution of local 3D Gaussians. Attributes like opacity, color, rotation, and scale are dynamically predicted for Gaussians linked to each anchor within the viewing frustum, enabling adaptation to different viewing directions and distances. Initial anchor points are derived by voxelizing the sparse, irregular point cloud from Structure from Motion (SfM), forming a regular grid. To refine and grow the anchors, Gaussians are spatially quantized using voxels, with new anchors created at the centers of significant voxels, identified by their average gradient over N training steps. Random elimination and opacity-based pruning regulate anchor growth and refinement.
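The anchor initialization by voxelization can be sketched as snapping SfM points to a regular grid and keeping one anchor per occupied voxel (a minimal illustration of the first step only):

```python
def voxelize_anchors(points, voxel_size):
    """Snap sparse SfM points to a regular grid and keep one anchor per
    occupied voxel (placed at the voxel centre), turning an irregular
    point cloud into a regular anchor grid."""
    seen = set()
    anchors = []
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        if key not in seen:
            seen.add(key)
            anchors.append(tuple((k + 0.5) * voxel_size for k in key))
    return anchors
```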
LightGaussian aims to transform 3D Gaussians into a more efficient and compact format, avoiding the scalability issues that arise from the large number of SfM (Structure from Motion) points in unbounded scenes. Inspired by network pruning, the method identifies Gaussians that contribute minimally to scene reconstruction and employs a pruning and recovery process, efficiently reducing redundancy in Gaussian counts while maintaining visual quality. Additionally, LightGaussian utilizes knowledge distillation and pseudo-view augmentation to transfer spherical harmonic coefficients to a lower degree. Furthermore, the authors propose a Gaussian Vector Quantization based on the global significance of Gaussians to quantize all redundant attributes, achieving lower-bitwidth representations with minimal accuracy losses.
Octree-GS introduces an octree structure to 3D Gaussian splatting. Starting with a sparse point cloud, an octree is constructed for the bounded 3D space, where each level corresponds to a set of anchor Gaussians assigned to different levels of detail (LOD). This method selects the necessary LOD based on the observation view, gradually accumulating Gaussians from higher LODs for final rendering. The model is trained using standard image reconstruction and volume regularization losses.
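A sketch of distance-based LOD selection with a log2 falloff (an assumed form for illustration; Octree-GS's actual selection also depends on factors such as image resolution):

```python
import math

def select_lod(distance, d_max, num_levels):
    """Pick a level of detail from the viewing distance: views at the
    scene's far extent get the coarsest level (0), and each halving of
    the distance steps up one level, clamped to the finest available."""
    level = int(math.log2(d_max / max(distance, 1e-6)))
    return max(0, min(num_levels - 1, level))
```

Rendering would then accumulate anchor Gaussians from level 0 up to the selected level, as described above.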
Mini-Splatting enhances Gaussian distribution through Blur Split, which refines Gaussians in blurred regions, and Depth Reinitialization, which repositions Gaussians based on newly generated depth points, calculated from the mid-point of ray intersections with Gaussian ellipsoids, thus avoiding artifacts from alpha blending. For simplification, Intersection Preserving retains Gaussians with the greatest visual impact, while Sampling maintains geometric integrity and rendering quality, reducing complexity.
This method employs a global scoring approach to guide the addition of Gaussians, ensuring efficient densification. Each Gaussian is assigned a score based on four factors: 1) gradient, 2) pixel coverage, 3) per-view saliency, and 4) core attributes like opacity, depth, and scale. Gaussians with the top B scores, where B is the desired number of new Gaussians, are then split or cloned to optimize the scene's representation. By calculating a composite score that reflects both the scene’s structural complexity and visual importance, only the most critical areas are targeted for Gaussian splitting or cloning, resulting in more effective scene representation.
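Selecting the top-B candidates from precomputed composite scores is a simple partial sort; a sketch assuming the per-Gaussian scores are already given:

```python
import heapq

def select_for_densification(scores, budget):
    """Return the indices of the `budget` highest-scoring Gaussians,
    which would then be split or cloned (the composite scores combining
    gradient, coverage, saliency, and core attributes are assumed
    precomputed)."""
    return [i for _, i in heapq.nlargest(budget, ((s, i) for i, s in enumerate(scores)))]
```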
This method prioritizes fine details through Atom Gaussians, which are isotropic and uniformly sized to align closely with the scene's geometry, while large Gaussians are merged to cover smooth surfaces. In addition, Geometry-Guided Optimization uses an Edge-Aware Normal Loss and multi-scale SSIM to maintain geometric accuracy. The Edge-Aware Normal Loss is calculated as the product of the normal map, derived from the pre-optimized 3DGS, and the edge map, which is derived from the gradient magnitude of the ground truth RGB image.
This method generates depth and normal maps that guide the growth and adjustment of Gaussians. It employs patch matching to propagate depth and normal information from neighboring pixels to generate new values. Geometric filtering and selection then identify pixels needing additional Gaussians, which are initialized using the propagated information. It also introduces a planar loss to ensure Gaussians match real surfaces more closely. This method enforces consistency between the Gaussian's rendered normal and the propagated normal using L1 and angular loss.
This method introduces a simple yet effective modification to the densification process in the original 3D Gaussian Splatting (3DGS). It leverages the view-independent (0th) spherical harmonics (SH) coefficient gradient to better assess color cues for densification, while using the 2D position gradient more coarsely to refine areas where structure-from-motion (SfM) struggles to capture fine structures.
If you use our survey in your research, please cite our work. You can use the following BibTeX entry:
@misc{3DGSzip2024,
title={3DGS.zip: A survey on 3D Gaussian Splatting Compression Methods},
author={Milena T. Bagdasarian and Paul Knoll and Yi-Hsin Li and Florian Barthel and Anna Hilsmann and
Peter Eisert and Wieland Morgenstern},
year={2024},
eprint={2407.09510},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2407.09510},
}