© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
CT imaging technology is a crucial tool for revealing the dynamic evolution of gas-water displacement interfaces within porous media. However, challenges such as insufficient segmentation accuracy, poor robustness in 3D reconstruction, and a lack of systematic morphological quantification hinder deeper understanding of gas-water flow mechanisms. To address these challenges, this paper proposes an integrated technical solution: 1) An improved 3D Attention U-Net segmentation model, which introduces a projection attention module (PAM) before the attention gate (AG) module to enhance the effective feature representation of the encoder layer. This enables accurate integration of low-level surface features with high-level abstract features. Additionally, a hybrid loss function with a weight parameter λ is designed to balance class imbalance and boundary segmentation accuracy. 2) A multi-feature fusion-Transformer matching strategy for reconstruction, which integrates dense feature point clouds of the gas-water interface extracted by Oriented FAST and Rotated BRIEF (ORB), Harris corners, and Speeded Up Robust Features (SURF). The global attention mechanism of the Transformer is applied to achieve coarse-to-fine, scale-invariant feature point matching, thus improving the precision of 3D reconstruction. 3) A multi-dimensional morphological quantification index system for quantitative representation of the interface's geometric and dynamic features. Experiments based on real core CT gas-water data demonstrate that the improved 3D Attention U-Net achieves a Dice coefficient of 0.92, IoU of 0.86, mean surface distance of 2.1 μm, and Hausdorff distance (HD) of 5.3 μm, providing pixel-level overlap and micron-scale boundary restoration. The segmentation results are highly consistent with the measured cross-sectional semi-variance curves in the X, Y, and Z directions, with precise matching of local morphology, inflection points, and displacement features.
The 3D reconstruction point cloud has an RMSE of 0.021 mm, with relative deviations of 2.3%, 1.8%, and 3.1% in surface area, enclosed volume, and average curvature, respectively. The core morphological parameters have a deviation of less than 2% from measured values, successfully capturing the "expansion-filling-stabilization" three-phase displacement pattern, supporting the analysis of the relationship between "interface morphology and displacement efficiency." This method provides reliable technical support for the quantitative study of gas-water processes in porous media and can be extended to engineering fields such as oil and gas development and CO₂ geological sequestration.
CT gas-water interface, 3D deep segmentation, 3D Attention U-net, PAM, hybrid loss function, ORB/Harris/SURF feature extraction, Transformer feature matching, 3D reconstruction, morphological quantification
The gas-water displacement process in porous media is a core physical mechanism in engineering fields such as oil and gas extraction, CO₂ geological sequestration, and groundwater pollution remediation. The dynamic evolution of the interface directly determines fluid migration efficiency and resource utilization efficiency [1-3]. X-ray CT, with its high resolution and non-invasive advantages [4, 5], has become the mainstream technology for real-time observation of the spatial distribution of gas-water displacement interfaces. By continuous CT scanning, two-dimensional slice images of the interface at different time steps can be obtained, providing a data basis for three-dimensional structural analysis and mechanism revealing.
However, CT gas-water images face three major technical bottlenecks: 1) Insufficient interface segmentation accuracy: The gas-water interface is affected by noise, gray-scale inhomogeneity, and fluid diffusion effects, resulting in blurred boundaries with small gray-scale differences from the background core matrix [6, 7]. Although the existing 3D Attention U-net enhances interface features through attention mechanisms, it is easily disturbed by irrelevant signals when merging low-level features from the encoding area with high-level features from the decoding area, leading to boundary shifts [8, 9]; 2) Poor robustness in three-dimensional reconstruction: Traditional feature point extraction methods struggle to balance "density" and "robustness." Although ORB has strong real-time performance, its feature points are sparse in regions with few textures. Harris corners are sensitive to noise, and SURF, though resistant to scale variations, has high computational complexity [10, 11]. Moreover, feature point matching is easily influenced by scale changes in CT images, leading to point cloud misalignment and decreased reconstruction accuracy [12, 13]; 3) Unsystematic morphological quantification: Existing studies mainly focus on a single metric, lacking the integration of multi-dimensional features such as interface curvature, migration rate, and stability, and are unable to fully characterize the physical essence of interface evolution [14-16]. Therefore, developing an integrated method for high-accuracy segmentation, high-robustness reconstruction, and multi-dimensional quantification is of significant theoretical and practical value for revealing the gas-water seepage mechanism and optimizing engineering design.
To address the above challenges, this paper aims to solve the bottlenecks in CT gas-water interface analysis. The specific objectives include: 1) Proposing an improved 3D Attention U-net segmentation model that enhances the accuracy and robustness of interface segmentation by embedding a PAM and designing a hybrid loss function; 2) Constructing a "multi-feature fusion extraction-Transformer precise matching" three-dimensional reconstruction framework to achieve accurate reconstruction of the gas-water interface’s dense point cloud; 3) Establishing a multi-dimensional morphological quantification index system to achieve quantitative representation of the interface's geometric and dynamic features.
The main contributions of this paper include the following three aspects: 1) Innovation in the segmentation model: Embedding a PAM before the AG module in the 3D Attention U-net, strengthening the feature representation of the encoding area by feature re-projection and attention weighting, and suppressing irrelevant signals. A hybrid loss function with a weight parameter λ is introduced to balance cross-entropy loss and Dice loss; 2) Innovation in the reconstruction method: Fusing ORB, Harris, and SURF to extract dense feature points of the gas-water interface, and using the global attention mechanism of Transformer to achieve "coarse-to-fine" scale-invariant matching, improving point cloud matching accuracy and reconstruction robustness; 3) Innovation in the quantification system: Integrating interface geometric features and dynamic features to build a multi-dimensional quantification index system for the quantitative representation of gas-water interface evolution and its physical significance correlation.
The remainder of the paper is organized as follows: Section 2 provides a detailed explanation of the proposed segmentation, reconstruction, and quantification methods; Section 3 verifies the effectiveness of the methods through experiments and analyzes the results; Section 4 discusses the advantages, limitations, and application value of the methods, summarizes the paper, and outlines future directions.
This section details the integrated segmentation-reconstruction-quantification method for CT gas-water interfaces. The workflow is as follows: 1) Preprocess the CT sequence images and segment them with the improved 3D Attention U-net to obtain binary images of the gas-water interface; 2) Extract a dense point cloud from the segmented interface images using multi-feature fusion, and perform three-dimensional reconstruction using Transformer matching; 3) Calculate multi-dimensional morphological quantification indices based on the reconstructed 3D model.
2.1 CT gas-water interface segmentation based on the improved 3D Attention U-net
2.1.1 Improved 3D Attention U-net network structure
The improved 3D Attention U-net follows the classic U-shaped architecture, with encoding, decoding, and skip connections as the basic framework. The core innovation lies in the introduction of the PAM and the optimization of the skip connection mechanism, which enhances segmentation accuracy of the gas-water interface by feature enhancement and cross-scale fusion. This architecture extracts abstract features through downsampling in the encoding area, restores spatial resolution through upsampling in the decoding area, and compensates for detail loss caused by downsampling via skip connections. PAM and the optimized connection mechanism specifically address the issue of weak features in the gas-water interface, which are prone to interference from the core matrix background.
Figure 1. Improved 3D Attention U-net network structure for CT gas-water interface segmentation
Figure 1 shows the improved 3D Attention U-net network structure for CT gas-water interface segmentation. The encoding area alternates four 3D convolution blocks with three 3D max-pooling layers. Each convolution block contains two 3×3×3 convolution layers, a batch normalization (BN) layer, and a ReLU activation function; the BN layers are omitted from the figure for simplicity. The convolution blocks extract local features and gradually enhance the texture and morphological features of the gas-water interface. The max-pooling layers perform downsampling with a 2×2×2 kernel and a stride of 2, which compresses the feature map while preserving key structural information. After three such poolings, the feature map output by the encoding area is 1/8 of the input size along each spatial axis and carries high-level abstract features. The decoding area consists of four 3D deconvolution blocks, each comprising a 2×2×2 deconvolution layer, two 3×3×3 convolution layers, a BN layer, and a ReLU activation function. The deconvolution layers gradually upsample the feature map to restore its spatial resolution and provide structural support for precise segmentation.
PAM is the core enhancement module of the network, embedded between the output of the encoding area and the attention gate (AG) module. It strengthens target feature representation and suppresses background interference through feature re-projection and attention weighting. The process is as follows. First, the 3D feature map $F_{enc} \in \mathbb{R}^{C \times H \times W \times D}$ output by the encoding area, where C is the number of channels and H, W, and D are the spatial dimensions, undergoes A/Z/Y/X multi-dimensional 3D average pooling to extract feature information along different spatial dimensions. Then, the pooled features from each dimension are concatenated into a multi-dimensional 3D pooled feature map, which is projected to a lower-dimensional space using a 1×1×1 3D convolution to reduce computational complexity, followed by BN and a ReLU activation to enhance the feature representation. The processed features are then separated into multiple branches through 3D feature separation, and each branch generates the attention weights for its dimension through a 1×1×1 3D convolution and a Sigmoid activation function. Finally, the attention weights from each branch are multiplied element-wise with the original 3D feature map output by the encoding block, resulting in $F_{att} \in \mathbb{R}^{C \times H \times W \times D}$. The AG module aligns the channels of $F_{att}$ with the upsampled feature map $F_{dec}$ from the decoding area, focuses on the target region via attention weight allocation, and outputs the fused feature $F_{fusion}$. The optimized skip connection therefore no longer transmits the original encoder features directly; instead, it fuses them with the decoder features after enhancement by PAM. This reduces interference from background noise during fusion and improves the effectiveness of cross-scale feature fusion. Figure 2 shows the structure of the PAM for CT gas-water interface 3D segmentation.
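The pooling, projection, separation, and weighting steps described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it pools only along the three spatial axes (the "A" branch of the A/Z/Y/X pooling is not modelled), it replaces BN with a simple per-channel standardisation, and the function name `pam_forward` and the weight matrices `w_reduce`, `w_h`, `w_w`, `w_d` are hypothetical stand-ins for the learned 1×1×1 convolutions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pam_forward(f_enc, w_reduce, w_h, w_w, w_d):
    """Axis-wise attention on an encoder feature map of shape (C, H, W, D)."""
    _, h, w, _ = f_enc.shape
    # Average-pool over the two other spatial axes: one profile per axis.
    pool_h = f_enc.mean(axis=(2, 3))  # (C, H)
    pool_w = f_enc.mean(axis=(1, 3))  # (C, W)
    pool_d = f_enc.mean(axis=(1, 2))  # (C, D)
    # Concatenate the profiles and project to fewer channels.
    # A 1x1x1 convolution is a linear map over the channel axis.
    pooled = np.concatenate([pool_h, pool_w, pool_d], axis=1)  # (C, H+W+D)
    reduced = w_reduce @ pooled                                # (C_r, H+W+D)
    # Assumption: per-channel standardisation stands in for BN, then ReLU.
    mean = reduced.mean(axis=1, keepdims=True)
    std = reduced.std(axis=1, keepdims=True)
    reduced = np.maximum((reduced - mean) / (std + 1e-5), 0.0)
    # Separate into per-axis branches; each yields weights in (0, 1).
    r_h, r_w, r_d = np.split(reduced, [h, h + w], axis=1)
    a_h = sigmoid(w_h @ r_h)  # (C, H)
    a_w = sigmoid(w_w @ r_w)  # (C, W)
    a_d = sigmoid(w_d @ r_d)  # (C, D)
    # Broadcast the weights back onto the original feature map.
    return (f_enc
            * a_h[:, :, None, None]
            * a_w[:, None, :, None]
            * a_d[:, None, None, :])
```

Because every weight lies in (0, 1), the output has the same shape as the input and never exceeds it in magnitude, which is one way to read the "suppression of irrelevant signals" described above.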
Figure 2. Structure of the PAM for CT gas-water interface 3D segmentation
2.1.2 Improved hybrid loss function
Gas-water interface segmentation faces two core challenges: 1) The significant pixel ratio difference between the interface region and the core matrix in CT images [17], resulting in severe class imbalance; 2) The small number of boundary pixels and the gradual gray-scale variation [18], making it difficult for traditional loss functions to accurately capture boundary features. To address these two issues simultaneously, a hybrid loss function Lmix with a weight parameter λ is designed, which combines cross-entropy loss and Dice loss through weighted fusion to balance class distribution and boundary segmentation accuracy.
The cross-entropy loss LCE primarily addresses the class imbalance issue by penalizing misclassified pixels through logarithmic probability. It assigns a higher misclassification cost to the interface pixels, which occupy a very small proportion. The calculation formula is as follows:
$L_{C E}=-\frac{1}{N} \sum_{i=1}^N\left[y_i \log \left(p_i\right)+\left(1-y_i\right) \log \left(1-p_i\right)\right]$ (1)
where N is the total number of pixels, $y_i$ is the true label of the i-th pixel, and $p_i$ is the model's predicted probability that the pixel belongs to the interface. The Dice loss $L_{Dice}$ enhances boundary segmentation accuracy by measuring the overlap between the predicted and true regions, effectively focusing on the subtle features of the interface boundary. The calculation formula is as follows:
$L_{Dice}=1-\frac{2 \sum_{i=1}^N y_i p_i+\epsilon}{\sum_{i=1}^N y_i^2+\sum_{i=1}^N p_i^2+\epsilon}$ (2)
where $\epsilon = 10^{-5}$ is a smoothing term that avoids division by zero in extreme cases.
The final form of the hybrid loss function is $L_{mix} = \lambda L_{CE} + (1-\lambda) L_{Dice}$, where the weight parameter λ adjusts the contribution of the two losses. The optimal λ is determined by 5-fold cross-validation over the range λ∈[0,1], and the final value is set to λ = 0.3. This value allows the cross-entropy loss to sufficiently suppress the bias caused by class imbalance while enabling the Dice loss to fully optimize boundary features, achieving the best balance between interface pixel recognition and boundary morphology characterization.
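As a minimal numerical sketch of Eqs. (1) and (2) and their weighted combination, the following NumPy function evaluates the hybrid loss for a flat array of labels and predicted probabilities. The function name `hybrid_loss` and the probability clipping bound `1e-7` (used only to keep the logarithm finite) are assumptions for illustration; the defaults λ = 0.3 and ε = 10⁻⁵ are taken from the text.

```python
import numpy as np

def hybrid_loss(y_true, p_pred, lam=0.3, eps=1e-5):
    """Return (L_mix, L_CE, L_Dice) for binary labels and probabilities."""
    y = np.asarray(y_true, dtype=float).ravel()
    # Assumption: clip probabilities so that log() stays finite.
    p = np.clip(np.asarray(p_pred, dtype=float).ravel(), 1e-7, 1 - 1e-7)
    # Eq. (1): mean binary cross-entropy over all N pixels.
    l_ce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    # Eq. (2): soft Dice loss with smoothing term eps.
    l_dice = 1 - (2 * np.sum(y * p) + eps) / (np.sum(y**2) + np.sum(p**2) + eps)
    # Weighted combination: lam * L_CE + (1 - lam) * L_Dice.
    return lam * l_ce + (1 - lam) * l_dice, l_ce, l_dice
```

For example, `hybrid_loss([0, 0, 0, 1], [0.1, 0.2, 0.1, 0.8])` returns the combined loss together with its two components, which makes it easy to see how each term responds when λ is varied.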
2.1.3 Dataset construction and preprocessing
The experimental data are sourced from the Micro-CT scanning results of sandstone core gas-water displacement physical simulation experiments. The spatial resolution of the scanning equipment is 50 μm, with a scan step of 1 mm and a time interval of 30 seconds, resulting in dynamic scan data for 100 time steps. Each time step consists of 200 two-dimensional CT slices, with each slice having a size of 512×512. The data covers the entire dynamic process of gas-water displacement from the initial stage to the stable stage, providing rich dynamic feature samples for the interface segmentation model. The data labeling was independently completed by two experts with more than 5 years of experience in rock mechanics research using the LabelMe3D tool. The labeled object is the gas-water interface region in each slice. The intersection of the labeling results from both experts was taken as the gold standard for segmentation, ensuring the accuracy and authority of the labeling results.
The purpose of data preprocessing is to improve image quality, unify data distribution, and increase sample size, thereby providing high-quality input for model training. First, Gaussian filtering is applied for denoising. A filter kernel with σ = 1.0 is selected to smooth image noise while retaining the subtle gray-scale changes of the interface. This parameter is determined by comparing the gray-scale contrast between the interface and background at different σ values, achieving a balance between denoising and feature preservation. Subsequently, gray-scale normalization linearly maps the original gray-scale values to the range [0,1] using $I_{norm} = (I - I_{min})/(I_{max} - I_{min})$, where I is the original gray-scale value, and $I_{min}$ and $I_{max}$ are the minimum and maximum gray-scale values of a single image. This operation eliminates the gray-scale shift between different scan time steps, unifying the data distribution.
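The two preprocessing steps can be sketched for a single 2D slice as follows. This is an illustrative NumPy version under assumptions: the kernel is truncated at three standard deviations, borders are handled by reflection, and a constant image is mapped to zeros; none of these details are specified in the text, and the function names are hypothetical.

```python
import numpy as np

def gaussian_kernel_1d(sigma=1.0):
    """Normalised 1D Gaussian kernel, truncated at about 3 sigma (assumption)."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x**2) / (2 * sigma**2))
    return k / k.sum()

def gaussian_smooth(image, sigma=1.0):
    """Separable Gaussian filter with reflected borders; keeps the input shape."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(np.asarray(image, dtype=float), r, mode="reflect")
    # Filter along rows, then along columns ("valid" removes the padding).
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def min_max_normalize(image):
    """Map gray values of one image linearly onto [0, 1]."""
    img = np.asarray(image, dtype=float)
    i_min, i_max = img.min(), img.max()
    if i_max == i_min:
        # Assumption: a constant image carries no contrast, so return zeros.
        return np.zeros_like(img)
    return (img - i_min) / (i_max - i_min)
```

A slice would then be prepared with `min_max_normalize(gaussian_smooth(slice_2d, sigma=1.0))`, where `slice_2d` is a placeholder for one 512×512 CT slice.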
To mitigate the issue of model overfitting, data augmentation strategies are adopted to increase the training samples, including random rotation, horizontal and vertical flipping, and random scaling. The rotation and flipping operations simulate slight pose changes of the core during scanning, while the scaling operation enhances the model's ability to adapt to interface features at different scales. The augmented dataset is split into training, validation, and test sets in a 7:2:1 ratio. The training set is used for model parameter iteration and update, the validation set is used for hyperparameter tuning and overfitting monitoring during the training process, and the test set is used for objective evaluation of the model's final segmentation performance. This splitting ratio follows the conventional setup in medical image segmentation and ensures the reliability of the evaluation results.
2.2 Three-dimensional reconstruction of CT gas-water interface based on multi-feature fusion and transformer matching
2.2.1 Dense feature point extraction using multi-feature fusion
No single feature point extraction method fully covers the complex characteristics of the gas-water interface. ORB is fast enough for real-time use but has weak scale adaptability, Harris corner detection responds well to edge and corner structure but lacks noise resistance, and SURF is scale-invariant but computationally expensive. Therefore, this paper combines the ORB, Harris, and SURF algorithms to extract dense feature points from the interface. Their complementary strengths improve feature point integrity, stability, and robustness, which lays the foundation for subsequent accurate matching.
ORB feature point extraction is based on the FAST algorithm to detect corner points. After selecting an initial threshold T = 20, non-maximum suppression is applied to remove redundant points and ensure a sparse and uniform distribution of feature points. The centroid of the gray-scale region is calculated to determine the main direction, and a 256-dimensional rotation-invariant BRIEF descriptor is generated based on this direction, providing rotational invariance to the feature points. Harris feature point extraction is performed by calculating the second-order moment matrix of the image gray-scale:
$M=\left[\begin{array}{cc}\mathrm{I}_{\mathrm{x}}^2 & \mathrm{I}_{\mathrm{x}} \mathrm{I}_{\mathrm{y}} \\ \mathrm{I}_{\mathrm{x}} \mathrm{I}_{\mathrm{y}} & \mathrm{I}_{\mathrm{y}}^2\end{array}\right]$ (3)
where $I_x$ and $I_y$ are the gray-scale gradients in the x and y directions, respectively. The response value is calculated as $R = \det(M) - k\,(\operatorname{trace}(M))^2$ with $k = 0.04$. Points with a response value greater than $0.01 \times R_{max}$ are selected as corner points. This threshold is experimentally validated and effectively selects key feature points along the interface boundary. SURF feature point extraction detects scale-space extrema using the Hessian matrix, and sub-pixel feature point localization is achieved through interpolation. The main direction is determined by calculating the Haar wavelet response in the feature point neighborhood, and finally, a 64-dimensional SURF descriptor is generated, ensuring scale invariance.
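The Harris response and the relative threshold can be sketched in NumPy as below. One detail is an assumption rather than something stated in the text: the entries of M are summed over a small window (here a 3×3 box) before the response is evaluated. Some local aggregation is required, because for a single pixel det(M) = Ix²·Iy² − (Ix·Iy)² is identically zero. The function names and the `radius` parameter are hypothetical.

```python
import numpy as np

def box_sum(a, radius=1):
    """Sum over a (2r+1) x (2r+1) window, using edge padding at the borders."""
    p = np.pad(a, radius, mode="edge")
    out = np.zeros(a.shape, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_corners(image, k=0.04, rel_threshold=0.01, radius=1):
    """Return the response map R and the (row, col) indices above threshold."""
    img = np.asarray(image, dtype=float)
    iy, ix = np.gradient(img)           # gradients along rows (y) and columns (x)
    sxx = box_sum(ix * ix, radius)      # windowed entries of the matrix M
    syy = box_sum(iy * iy, radius)
    sxy = box_sum(ix * iy, radius)
    det = sxx * syy - sxy**2
    trace = sxx + syy
    r = det - k * trace**2              # R = det(M) - k * trace(M)^2
    mask = r > rel_threshold * r.max()  # keep points with R > 0.01 * R_max
    return r, np.argwhere(mask)
```

On a synthetic image containing a bright square, the response is positive at the square's corners and negative along its straight edges, which is the behaviour the threshold relies on.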
Multi-feature fusion uses a voting method and weighted fusion strategy to enhance feature quality. The voting method retains the feature point with the highest descriptor matching score at the same pixel location and eliminates redundant information. For non-redundant feature points extracted by the three algorithms, the descriptors are fused using the weighted percentages of ORB (40%), Harris (30%), and SURF (30%). This weight distribution comprehensively considers the advantages of the three features: ORB descriptors have high dimensionality and strong discriminability, Harris corners have accurate positioning, and SURF has good scale adaptability. The weighted fusion descriptors integrate multi-dimensional feature information, significantly improving the robustness of subsequent feature matching.
2.2.2 Feature point accurate matching based on transformer
CT slices of the gas-water interface often exhibit scale differences and gray-scale noise. Traditional feature matching methods are easily affected by scale changes and struggle to capture the global correlations between feature points, leading to low matching accuracy and high outlier ratios. This paper designs a Transformer-based coarse-to-fine feature point matching model, which achieves scale-invariant, accurate matching through feature encoding, global attention matching, and iterative optimization, effectively solving the above problems.
First, multi-time-step CT gas-water interface slice images are input into a 3D CNN convolutional neural network to extract 3D feature maps. This network strengthens the 3D texture and morphological features of the gas-water interface through multiple convolution operations, providing high-recognition basic features for subsequent matching.
Next, the 3D feature maps are input into a 3D self-attention and cross-attention module. Let the reference time-step feature be Si and the target time-step feature be Sj. The self-attention mechanism within the module captures the global correlations within the same feature map, and the cross-attention mechanism builds the feature associations between Si and Sj, calculating the matching relationship Si/Sj=di/dj. Finally, the CT gas-water interface depth feature map is output, and the coarse matching associations of the feature points are preliminarily determined.
Finally, a feature-point-guided aggregation module optimizes the matching results. This module includes a cross-attention module and a linear cross-attention module. By combining upsampling fusion and downsampling fusion operations, multi-scale feature aggregation is performed on the 3D feature map. The above module structure is repeated four times to fully integrate the interface features at different scales, ultimately generating a CT gas-water interface feature point matching matrix of size H/8×W/8. This process, through multi-scale aggregation and attention correlation, balances local feature consistency and global correlation information, significantly improving the accuracy and stability of the matching results, and effectively reducing the interference of scale differences and noise in the matching process. Figure 3 shows the Transformer modeling method architecture for CT gas-water interface feature point matching.
Figure 3. Transformer modeling method architecture for CT gas-water interface feature point matching
2.2.3 Three-dimensional point cloud reconstruction and optimization
The accurately matched feature point pairs provide reliable two-dimensional correspondences for three-dimensional reconstruction. Based on camera calibration parameters and the triangulation principle, the 3D point cloud of the gas-water interface can be generated. Subsequent denoising and smoothing optimization further enhance the point cloud quality, providing high-precision data support for the subsequent morphological quantification.
The 3D point cloud is generated by calculating the 3D coordinates of the feature points based on the triangulation principle. The camera intrinsic matrix K and extrinsic matrix [R|t] are obtained through preliminary camera calibration, where R is the rotation matrix, and t is the translation vector. For the accurately matched points (u1,v1) and (u2,v2) in the reference and target images, a linear system of equations is constructed:
$\left\{\begin{array}{l}s_1\left[\begin{array}{c}u_1 \\ v_1 \\ 1\end{array}\right]=K[I \mid 0] P \\ s_2\left[\begin{array}{c}u_2 \\ v_2 \\ 1\end{array}\right]=K[R \mid t] P\end{array}\right.$ (4)
where $s_1$ and $s_2$ are scale factors, and P is the 3D point coordinates to be solved. Solving this system of equations gives the 3D coordinates of a single feature point. This operation is repeated for all CT slices across all time steps, and the 3D coordinates of all feature points are integrated to generate the complete 3D point cloud of the gas-water interface, $P_{cloud} \in \mathbb{R}^{M \times 3}$, where M is the total number of points in the point cloud.
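One common way to solve such a two-view system for a matched pair (u1, v1) and (u2, v2) is the linear (DLT) method: the scale factors are eliminated, which leaves four homogeneous equations in P that are solved by SVD. The sketch below assumes this approach; the text does not state which solver is used, and the function name `triangulate_point` is hypothetical.

```python
import numpy as np

def triangulate_point(K, R, t, uv1, uv2):
    """Linear triangulation of one 3D point from two matched pixels."""
    # Projection matrices of the reference view K[I|0] and the target view K[R|t].
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])
    (u1, v1), (u2, v2) = uv1, uv2
    # Eliminating the scale factors gives two linear equations per view.
    A = np.vstack([
        u1 * P1[2] - P1[0],
        v1 * P1[2] - P1[1],
        u2 * P2[2] - P2[0],
        v2 * P2[2] - P2[1],
    ])
    # The homogeneous solution is the right singular vector belonging to
    # the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

Applying the function to every matched pair and stacking the results row by row yields an M×3 array of the kind denoted as the point cloud above.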
Point cloud optimization is performed step-by-step using statistical filtering and moving least squares, balancing denoising and detail preservation. The statistical filtering sets the number of neighboring points (k=20), calculates the mean and standard deviation of the distances between each point and its neighboring points, and removes outlier points with a distance greater than the mean plus 2 times the standard deviation. This parameter setting effectively removes isolated points caused by scan noise and matching errors. The moving least squares method smooths the point cloud while preserving subtle morphological features of the interface. By constructing a locally weighted polynomial surface to fit the point cloud data, local deviations in the point cloud are corrected. The optimized point cloud maintains good smoothness while accurately restoring the true geometric morphology of the gas-water interface, providing high-quality foundational data for subsequent morphological quantification analysis.
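The statistical filtering step can be sketched as follows. The sketch assumes the usual reading of this filter: each point's mean distance to its k nearest neighbours is computed, and a point is removed when that value exceeds the global mean plus two standard deviations. It uses a brute-force distance matrix, so it is only practical for small clouds; the moving least squares smoothing is not shown. The function name is hypothetical.

```python
import numpy as np

def statistical_outlier_filter(points, k=20, n_sigma=2.0):
    """Return (filtered_points, keep_mask) for an (M, 3) point cloud."""
    pts = np.asarray(points, dtype=float)
    # Brute-force pairwise distances, O(M^2) memory (small clouds only).
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=2))
    # Sort each row; column 0 is the point itself (distance 0), so skip it.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    mean_knn = knn.mean(axis=1)
    # Global threshold: mean + n_sigma * standard deviation.
    threshold = mean_knn.mean() + n_sigma * mean_knn.std()
    keep = mean_knn <= threshold
    return pts[keep], keep
```

For clouds of realistic size, a spatial index such as a k-d tree would replace the full distance matrix; the thresholding logic stays the same.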
2.3 Morphological quantification index system for CT gas-water interface
Based on the high-precision three-dimensional point cloud reconstructed in Section 2.2, a “geometric feature-dynamic feature” dual-dimensional morphological quantification index system is constructed. Geometric features focus on the static spatial morphology of the interface, characterizing the distribution range, occupied volume, and morphological smoothness of the interface. Dynamic features correlate point cloud data from different time steps to reveal the evolutionary patterns of the interface during the displacement process. The two types of indicators complement each other, providing an objective basis for the quantification analysis of the gas-water process, displacement efficiency evaluation, and stability judgment.
2.3.1 Geometric feature indicators
Geometric feature indicators are used to precisely characterize the static spatial morphology of the gas-water interface. Three core indicators are selected: interface surface area, interface enclosing volume, and average curvature. These indicators cover key geometric information from three dimensions: distribution range, gas-phase occupied space, and morphological smoothness. The calculation process is based on the reconstructed three-dimensional point cloud, ensuring the objectivity and accuracy of the indicators.
Interface Surface Area: This is the core indicator for representing the spatial distribution of the interface. The calculation first converts the discrete point cloud into a continuous surface. The Poisson surface reconstruction algorithm is used to fit the three-dimensional point cloud. This algorithm solves the Poisson equation to construct a triangular mesh that fits the topology of the point cloud, effectively preserving the subtle morphological features of the interface. For the generated triangular mesh, the area of each triangle is calculated and summed to obtain the total interface surface area S. This indicator directly reflects the contact range between the gas-water interface and the water phase. The larger the surface area, the broader the gas-water exchange interface, providing basic data for subsequent analysis of mass transfer efficiency. Interface Enclosing Volume: This is used to quantify the spatial size occupied by the gas phase within the core sample. The Axis-Aligned Bounding Box (AABB) method is used to calculate the volume. All three-dimensional coordinates of the point cloud are traversed to extract the maximum and minimum values along the x, y, and z axes, and a rectangular bounding box is constructed based on these extreme values. The volume V of this box is the interface enclosing volume. This method is computationally efficient and reliable in precision, providing a clear reflection of the gas phase's spatial occupancy ability during the displacement process.
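Once a triangular mesh is available, both indicators reduce to a few array operations. The sketch below assumes the mesh has already been produced by a surface reconstruction step (not shown) and is given as a vertex array and a face index array; the function names are hypothetical.

```python
import numpy as np

def mesh_surface_area(vertices, faces):
    """Total area S of a triangular mesh (vertices: (V, 3), faces: (F, 3))."""
    v = np.asarray(vertices, dtype=float)
    f = np.asarray(faces, dtype=int)
    e1 = v[f[:, 1]] - v[f[:, 0]]
    e2 = v[f[:, 2]] - v[f[:, 0]]
    # Each triangle's area is half the norm of the cross product of two edges.
    return float(0.5 * np.linalg.norm(np.cross(e1, e2), axis=1).sum())

def aabb_volume(points):
    """Volume V of the axis-aligned bounding box of an (M, 3) point cloud."""
    p = np.asarray(points, dtype=float)
    extent = p.max(axis=0) - p.min(axis=0)  # box side lengths along x, y, z
    return float(np.prod(extent))
```

Note that the bounding box volume is an upper bound on the space actually enclosed by the interface, so it is best read as a measure of spatial extent rather than an exact gas-phase volume.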
Average Curvature: This focuses on the morphological smoothness and convex-concave characteristics of the interface and is a key indicator of how strongly the core pore structure influences the interface. The calculation is based on the triangular mesh: for each vertex in the mesh, the two principal curvatures $k_1$ and $k_2$ are calculated from the geometric relationship with its neighboring vertices, using quadratic surface fitting to ensure accuracy. The vertex curvature $H_i$ is the average of the two principal curvatures, $H_i = (k_1 + k_2)/2$, and the arithmetic mean of all vertex curvatures gives the average curvature $H_{avg}$. The sign of $H_{avg}$ represents the overall convex or concave shape of the interface: positive values indicate that the interface is convex, while negative values indicate that it is concave. The absolute value reflects the smoothness of the interface: smaller absolute values indicate a smoother interface, while larger values suggest significant fluctuations due to pore throat blockages.
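A quadratic surface fit at a single vertex can be sketched as below. The sketch makes several assumptions that the text leaves open: the neighbours are expressed in a local frame with the vertex at the origin and the z-axis along the vertex normal, the fitted surface has the form z = a·x² + b·x·y + c·y², and the sign convention follows the chosen normal direction. The function name is hypothetical.

```python
import numpy as np

def mean_curvature_quadric(neighbors_local):
    """Fit z = a x^2 + b x y + c y^2 and return (H, (k1, k2)) at the origin."""
    q = np.asarray(neighbors_local, dtype=float)
    x, y, z = q[:, 0], q[:, 1], q[:, 2]
    # Least-squares fit of the three quadric coefficients.
    A = np.column_stack([x**2, x * y, y**2])
    (a, b, c), *_ = np.linalg.lstsq(A, z, rcond=None)
    # At the origin the surface gradient is zero, so the principal curvatures
    # are the eigenvalues of the Hessian of the fitted surface.
    hessian = np.array([[2 * a, b], [b, 2 * c]])
    k1, k2 = np.linalg.eigvalsh(hessian)
    return 0.5 * (k1 + k2), (k1, k2)
```

Averaging the returned H over all mesh vertices gives the average curvature. As a sanity check, points sampled from a small patch of a sphere of radius 2 give a value close to 1/2.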
2.3.2 Dynamic feature indicators
Dynamic feature indicators are used to correlate three-dimensional point cloud data from different time steps, characterizing the evolution patterns of the gas-water interface over time. Two indicators are selected: interface migration rate and interface fluctuation amplitude. These indicators quantify the dynamic characteristics of the displacement process from the perspectives of macroscopic advancement efficiency and morphological stability, providing quantitative support for evaluating displacement effects and optimizing displacement parameters.
Interface Migration Rate: This indicator characterizes the overall advancing speed of the gas-water interface and is the core dynamic indicator of displacement efficiency. Before calculation, the overall spatial position of the interface at each time step must be determined, and the centroid coordinates are used as the representative position. For the point cloud at time t, the arithmetic mean of the three-dimensional coordinates of all points gives the centroid G(x, y, z) of the interface at that time. Let the centroids at times $t_1$ and $t_2$ be $G_1(x_1, y_1, z_1)$ and $G_2(x_2, y_2, z_2)$, respectively. The interface migration rate v is calculated by the formula:
$v=\frac{\sqrt{\left(x_2-x_1\right)^2+\left(y_2-y_1\right)^2+\left(z_2-z_1\right)^2}}{t_2-t_1}$ (5)
This indicator, through the ratio of the centroid distance to the time difference, eliminates the interference of local fluctuations on the overall advancement speed and can accurately reflect the macroscopic advancement efficiency of the interface under displacement pressure. The magnitude of this value is directly related to key parameters such as displacement pressure and core permeability.
Interface Fluctuation Amplitude: This indicator evaluates the stability of the interface morphology during the displacement process, focusing on how strongly the interface fluctuates in the displacement direction. The preset displacement direction in the gas-water experiment is the z-axis, and the fluctuation of the interface along this direction directly reflects morphological stability, so the z-direction is selected for calculating the fluctuation amplitude. For the point cloud at time t, the z-coordinates of all points are extracted, and the maximum value $z_{max}$ and minimum value $z_{min}$ are determined. The difference between these two values is the interface fluctuation amplitude $\Delta h = z_{max} - z_{min}$. The physical meaning of this indicator is clear: the smaller Δh is, the smoother the interface in the displacement direction, the more stable the morphology, and the more uniform the displacement process. An increase in Δh indicates that the interface is influenced by core heterogeneity, pore throat blockages, or other factors, leading to pronounced fingering or local fluctuations and reduced stability. By tracking the changes in Δh at different time steps, the stability evolution of the displacement process can be dynamically monitored.
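Both dynamic indicators follow directly from the point clouds at two time steps. A minimal NumPy sketch, with hypothetical function names and the displacement direction assumed to be stored in the third coordinate column:

```python
import numpy as np

def centroid(points):
    """Arithmetic mean of all point coordinates of an (M, 3) cloud."""
    return np.asarray(points, dtype=float).mean(axis=0)

def migration_rate(points_t1, points_t2, t1, t2):
    """Centroid displacement divided by the elapsed time, as in Eq. (5)."""
    shift = centroid(points_t2) - centroid(points_t1)
    return float(np.linalg.norm(shift) / (t2 - t1))

def fluctuation_amplitude(points, axis=2):
    """Range of coordinates along the displacement axis (z by default)."""
    z = np.asarray(points, dtype=float)[:, axis]
    return float(z.max() - z.min())
```

The units follow the inputs: with coordinates in millimetres and times in seconds, the rate is in mm/s and the amplitude in mm. Because the amplitude depends only on the two extreme points, it is sensitive to residual outliers, so it is best computed on the filtered point cloud.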
To quantify the performance gain of the proposed segmentation method in terms of both pixel overlap and boundary refinement, a multi-method comparison experiment was conducted. As seen in Table 1, the Dice coefficient of the proposed method reaches 0.92, a relative improvement of 5.7% over the original 3D Attention U-Net (0.87). This gain is attributed to the PAM, which sharpens the feature distinction between the gas-water interface and background pores and reduces the misclassification of boundary pixels seen in traditional methods. The IoU increases to 0.86 (from 0.78), indicating that the segmentation better retains the valid interface area while removing background noise, which is directly related to the stabilization of feature maps by the BN layers in the 3D convolutional block.
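For reference, the two overlap metrics can be computed as follows for a single predicted mask and its ground truth. This is a sketch under the assumption that each mask is represented as a set of foreground voxel indices; the function name `overlap_metrics` and the toy masks are hypothetical. The last lines reproduce the 5.7% relative Dice gain from the two reported values.

```python
def overlap_metrics(pred, truth):
    """Dice coefficient and IoU for two sets of foreground voxel indices."""
    inter = len(pred & truth)
    union = len(pred | truth)
    total = len(pred) + len(truth)
    dice = 2 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return dice, iou


# Hypothetical toy masks: voxel indices (i, j, k) labeled as interface.
pred = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
truth = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}
dice, iou = overlap_metrics(pred, truth)
print(f"Dice = {dice:.2f}, IoU = {iou:.2f}")

# Relative Dice gain of the improved model over the original 3D Attention U-Net.
relative_gain = (0.92 - 0.87) / 0.87
print(f"relative Dice gain = {relative_gain:.1%}")
```

For a single pair of masks the two metrics are linked by Dice = 2·IoU / (1 + IoU). Metrics averaged over many samples, as dataset-level results usually are, need not satisfy this identity exactly.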
Table 1. Comparison of quantitative performance metrics for different segmentation methods
| Segmentation Method | Dice Coefficient | IoU (Intersection over Union) | Pixel Accuracy (PA) | Average Surface Distance (ASD, μm) | HD (μm) |
| --- | --- | --- | --- | --- | --- |
| Improved 3D Attention U-Net | 0.92 | 0.86 | 0.97 | 2.1 | 5.3 |
| Original 3D Attention U-Net | 0.87 | 0.78 | 0.94 | 3.5 | 8.1 |
| Traditional 3D U-Net | 0.82 | 0.72 | 0.91 | 4.8 | 10.5 |
| Threshold Segmentation (Otsu Method) | 0.71 | 0.58 | 0.85 | 7.2 | 15.3 |
Figure 4. Morphological semi-variance function curves of CT gas-water interface segmentation results and measured profiles in X, Y, and Z directions
In terms of boundary precision, the average surface distance (ASD) is only 2.1 μm, which is 43.8% of the value for the traditional 3D U-Net (4.8 μm) and 29.2% of that for threshold segmentation (7.2 μm). The HD, representing the maximum deviation of the interface boundary, is reduced to 5.3 μm, well below the 8.1 μm of the original 3D Attention U-Net. The core of this improvement lies in the accurate localization of the interface boundary by the attention gate module: traditional methods are prone to boundary blurring due to the inhomogeneity of CT image grayscale, while the proposed method restores boundary pixels to micrometer-level accuracy through multi-dimensional attention weights.
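The two boundary metrics can be illustrated with a brute-force sketch on small surface point sets. The symmetric definition of the average surface distance used here is an assumption, since the exact variant is not stated in this section, and the names `nearest_distances` and `surface_distances` and the toy surfaces are hypothetical.

```python
import math


def nearest_distances(src, dst):
    """Distance from every point in src to its nearest neighbour in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def surface_distances(pred_surface, true_surface):
    """Symmetric average surface distance (ASD) and Hausdorff distance (HD)."""
    d_pred = nearest_distances(pred_surface, true_surface)
    d_true = nearest_distances(true_surface, pred_surface)
    asd = (sum(d_pred) + sum(d_true)) / (len(d_pred) + len(d_true))
    hd = max(max(d_pred), max(d_true))
    return asd, hd


# Hypothetical surfaces (μm): one predicted point is displaced by 1 μm.
true_surface = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
pred_surface = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (2.0, 0.0, 0.0)]
asd, hd = surface_distances(pred_surface, true_surface)
print(f"ASD = {asd:.3f} um, HD = {hd:.3f} um")

# ASD of the proposed method relative to two baselines in Table 1.
asd_vs_unet = 2.1 / 4.8
asd_vs_otsu = 2.1 / 7.2
```

The ASD averages over all boundary points while the HD reports only the worst one, so a single misplaced boundary patch raises the HD far more than the ASD. The two ratios evaluate to 0.4375 and about 0.2917, consistent with the 43.8% and 29.2% quoted above. The pairwise search is quadratic in the number of points, so a spatial index such as a k-d tree would be the usual choice for full-size surfaces.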
To quantify how accurately the proposed 3D deep segmentation method restores the spatial morphology of the gas-water interface, the morphological semi-variance function was used to compare the structural consistency between the segmentation result and the measured profile in 3D space. The directional curves in Figure 4 show the following. In the X direction, the semi-variance increases rapidly over the 0-50 lag distance range, reflecting significant local morphological variation of the interface over short distances. In this range the segmentation curve almost coincides with the measured profile curve, indicating that the method accurately captures the fine local morphology imposed on the interface by the core pore throats. In the 50-250 range, the curve gradually stabilizes, and the semi-variance of the segmentation result fluctuates within a narrow band that always encloses the measured curve, indicating that the restoration error for the macroscopic spatial continuity of the interface is controllable. In the Y direction, the slope of semi-variance growth in the 0-200 range matches that of the measured profile, and the slight inflection point of the measured curve at a lag distance of 200 is reproduced at the same position in the segmentation result, demonstrating the method's ability to recognize the non-uniform morphology of the interface. In the 200-500 range, the baseline deviation between the two curves is less than 10%, showing a high degree of agreement in the macroscopic spatial distribution of the interface. The Z direction is the displacement direction of the gas-water experiment. Here the semi-variance saturates quickly at short lag distances, and the saturation point of the segmentation result coincides with that of the measured profile. The curve crossover at a lag distance of 30 is also reproduced, showing that the method can accurately restore the dynamic morphological features of the interface in the displacement direction.
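As an illustration of how such directional curves are obtained, the sketch below computes an empirical semi-variance along one axis using the classical estimator $\gamma(h) = \frac{1}{2N(h)}\sum_{i=1}^{N(h)} (z_{i+h} - z_i)^2$ for a regularly sampled profile. Whether the paper's morphological semi-variance function takes exactly this form is an assumption, and the function names and both profiles are hypothetical.

```python
def semivariance(profile, lag):
    """Empirical semi-variance of a regularly sampled 1D profile at an integer lag."""
    n_pairs = len(profile) - lag
    if lag < 1 or n_pairs < 1:
        raise ValueError("lag must satisfy 1 <= lag < len(profile)")
    total = sum((profile[i + lag] - profile[i]) ** 2 for i in range(n_pairs))
    return total / (2 * n_pairs)


def semivariogram(profile, max_lag):
    """Semi-variance values for lags 1..max_lag."""
    return [semivariance(profile, h) for h in range(1, max_lag + 1)]


# Hypothetical interface height profiles along one direction.
measured = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
segmented = [0.0, 0.9, 0.1, 1.0, 0.0, 1.0]

gamma_measured = semivariogram(measured, max_lag=3)
gamma_segmented = semivariogram(segmented, max_lag=3)

# Largest gap between the two curves over the evaluated lags.
max_gap = max(abs(a - b) for a, b in zip(gamma_measured, gamma_segmented))
print(gamma_measured, gamma_segmented, max_gap)
```

A fast initial rise followed by a plateau, as described for the X and Z directions, corresponds to height differences that grow with separation only up to a characteristic length and then stop growing.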
In conclusion, the proposed 3D deep segmentation method achieves high consistency with the measured interface in three dimensions: fine local morphology, macroscopic spatial continuity, and dynamic features in the displacement direction. The output interface morphology data can directly support the analysis of the correlation mechanism between "interface morphology" and "displacement efficiency" in the gas-water process.
To quantitatively assess the systematic error of the proposed 3D deep segmentation method in quantifying the morphological parameters of the gas-water interface, three core indicators are selected: the convex/concave surface ratio, the high/medium/low curvature region ratio, and the fast advancing region ratio. The statistical distributions of the segmentation result and the measured interface are then compared. In terms of macroscopic topological morphology (Figure 5), the convex and concave surface ratios are key indicators of the expansion mode of the gas phase in the core pore space. The convex surface ratio in the segmentation result is about 42%, which deviates by only 2% from the measured value of about 43%. The concave surface ratio is about 23%, which deviates by only about 1% from the measured value of 22%. This high consistency indicates that the method can accurately reproduce the topological pattern of "radial expansion dominated by the convex surface and local pore filling by the concave surface." Accurate quantification of the convex surface ratio is the core basis for the subsequent calculation of gas-phase encroachment volume, and keeping the deviation within 2% limits the calculation error of the encroachment volume to less than 5%. In terms of microscopic morphological heterogeneity, the high, medium, and low curvature region ratios correspond to different morphological scales of the interface constrained by the pore throats. The high curvature region ratio in the segmentation result is about 12%, deviating by only about 0.8% from the measured value of about 13%. These regions correspond to the sharp morphological changes of the interface at pore throats, and their quantification accuracy directly determines the reliability of the "interface morphology-fluid permeability resistance" correlation analysis.
The deviations for the medium and low curvature regions are 1.2% and 1.5%, respectively, further demonstrating the stability of the method's quantification at different morphological scales and showing that it avoids the over-smoothing of flat interface morphology seen in traditional segmentation methods. Regarding dynamic displacement features, the fast advancing region ratio reflects the intensity of the "fingering" phenomenon of the interface. The ratio in the segmentation result is about 5%, deviating by only about 1% from the measured value of 6%, and both fall in the low ratio range, indicating that the method can accurately recognize the weak "fingering" characteristics at the early stage of displacement. The reliability of this result is the premise for subsequently adjusting the displacement pressure to suppress "fingering," and a deviation of about 1% supports the accuracy of displacement parameter optimization. As for the source of errors, the deviations for all indicators are well below the inherent systematic error of the core CT scan, which is about 5%, indicating that the quantification error of the method itself is minor by comparison.
Figure 5. Statistical distribution comparison of gas-water interface morphological parameters
Table 2. Geometric accuracy verification of 3D reconstruction results
| Indicator | Proposed Method (Based on Improved 3D Attention U-Net Segmentation) | Reconstruction Based on Traditional 3D U-Net Segmentation | Reconstruction Based on Threshold Segmentation | Measured Value Reference Range |
| --- | --- | --- | --- | --- |
| Point Cloud RMSE (mm) | 0.021 | 0.045 | 0.082 | < 0.03 mm |
| Point Cloud Registration Error (mm) | 0.018 | 0.039 | 0.075 | < 0.02 mm |
| Surface Area Relative Deviation (%) | 2.3 | 6.8 | 12.5 | < 5% |
| Enclosing Volume Relative Deviation (%) | 1.8 | 5.7 | 10.2 | < 4% |
| Average Curvature Relative Deviation (%) | 3.1 | 7.2 | 14.6 | < 6% |
Table 3. Dynamic evolution of morphological quantification indicators at different displacement time steps
| Displacement Time Step | Surface Area (mm²) | Enclosing Volume (mm³) | Average Curvature (mm⁻¹) | Interface Migration Rate (mm/s) | Interface Fluctuation Amplitude (mm) |
| --- | --- | --- | --- | --- | --- |
| 0 | 12.5 | 3.2 | 0.18 | - | 0.4 |
| 20 | 28.7 | 7.6 | 0.25 | 0.008 | 0.8 |
| 40 | 45.3 | 12.1 | 0.32 | 0.012 | 1.2 |
| 60 | 61.8 | 16.7 | 0.29 | 0.009 | 1.0 |
| 80 | 75.2 | 20.3 | 0.24 | 0.007 | 0.7 |
| 100 | 82.6 | 22.5 | 0.21 | 0.005 | 0.5 |
To verify the geometric fidelity of the 3D reconstruction results, the reconstruction metrics obtained with different segmentation methods were compared with the measured reference range. As seen in Table 2, the point cloud RMSE of the proposed method is 0.021 mm, which meets the measured reference range (< 0.03 mm), while the reconstruction based on the traditional 3D U-Net (0.045 mm) exceeds this range. This shows that the segmentation result of the proposed method has high boundary accuracy, which brings the point cloud sampling of the interface profile closer to the true shape and avoids the point cloud shift caused by blurred boundaries in traditional segmentation. In terms of morphological parameter quantification accuracy, the surface area relative deviation is only 2.3%, which corresponds to a gas-phase encroachment volume calculation error of about 4.8%, far below the industry standard upper limit of 10%. The relative deviation of the enclosing volume is 1.8%, indicating that the proposed method's quantification of the space occupied by the gas phase is highly consistent with the measured value, which is the core basis for the subsequent "displacement pressure-gas-phase volume" correlation analysis. The relative deviation of average curvature, 3.1%, is particularly notable. Curvature is a key indicator for characterizing the interaction between the interface and the pore throats. The 7.2% deviation of the traditional 3D U-Net can lead to a permeability resistance calculation error exceeding 15%, while the proposed method holds the deviation to 3.1%, ensuring the quantitative reliability of the "morphological features-permeability behavior" correlation analysis.
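The accuracy measures discussed above can be sketched as follows. The nearest-neighbour form of the point cloud RMSE is an assumption, since the pairing between reconstructed and reference points is not specified here, and the function names and toy clouds are hypothetical. The dictionary simply re-enters the Table 2 values so that each method can be checked against the reference limits.

```python
import math


def point_cloud_rmse(reconstructed, reference):
    """RMSE of nearest-neighbour distances from reconstructed points to the reference."""
    sq = [min(math.dist(p, q) for q in reference) ** 2 for p in reconstructed]
    return math.sqrt(sum(sq) / len(sq))


def relative_deviation(value, reference):
    """Relative deviation in percent."""
    return abs(value - reference) / abs(reference) * 100.0


# Hypothetical toy clouds (mm).
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstructed = [(0.0, 0.0, 0.03), (1.0, 0.0, -0.01)]
rmse = point_cloud_rmse(reconstructed, reference)
print(f"RMSE = {rmse:.4f} mm")

# Table 2: (proposed, traditional 3D U-Net, threshold) values and the upper limit.
table2 = {
    "Point cloud RMSE (mm)": ((0.021, 0.045, 0.082), 0.03),
    "Registration error (mm)": ((0.018, 0.039, 0.075), 0.02),
    "Surface area deviation (%)": ((2.3, 6.8, 12.5), 5.0),
    "Enclosing volume deviation (%)": ((1.8, 5.7, 10.2), 4.0),
    "Average curvature deviation (%)": ((3.1, 7.2, 14.6), 6.0),
}
within_limit = {
    name: [value < limit for value in values]
    for name, (values, limit) in table2.items()
}
for name, flags in within_limit.items():
    print(name, flags)
```

With the Table 2 values, only the first entry of each list (the proposed method) falls inside the reference range; both baseline reconstructions exceed the limit on every indicator.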
To reveal the coupling mechanism between the gas-water interface morphology and the displacement process, the evolution of key quantification indicators at different time steps was statistically analyzed. As shown in Table 3, in the early stage of displacement (time steps 0 to 40) the surface area increases from 12.5 mm² to 45.3 mm², with the enclosing volume expanding accordingly, corresponding to the rapid radial expansion of the gas phase in high-permeability pores. The average curvature increases to 0.32 mm⁻¹ and the fluctuation amplitude increases to 1.2 mm, reflecting the steep deformation of the interface when passing through narrow pore throats. At this stage, the migration rate increases to 0.012 mm/s, directly reflecting the effective transmission of the displacement pressure to high-permeability flow channels. In the middle stage of displacement (up to time step 80), the average curvature drops to 0.24 mm⁻¹ and the fluctuation amplitude decreases to 0.7 mm, indicating that the gas phase begins to fill medium- and low-permeability pores and that the interface becomes smoother through the "filling-flattening" process of the pore space. The migration rate slows to 0.007 mm/s, corresponding to the increased loss of displacement pressure in the complex pore network. The morphological change in this stage results from the coupling between the "expansion of the gas phase" and the "increase in flow resistance," and the quantification accuracy of the proposed method makes this coupling observable. In the later stage of displacement (time step 100), the growth of surface area and volume slows markedly, and the average curvature (0.21 mm⁻¹) and fluctuation amplitude (0.5 mm) return to values near the initial level, indicating that the gas-phase encroachment volume tends to saturate and the interface morphology reaches a stable state.
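The stage-wise trends can be checked directly from the Table 3 values. The sketch below is illustrative only: the helper names `increments` and `peak_step` are hypothetical, and the lists re-enter the table data.

```python
# Values re-entered from Table 3.
steps = [0, 20, 40, 60, 80, 100]
surface_area = [12.5, 28.7, 45.3, 61.8, 75.2, 82.6]  # mm^2
volume = [3.2, 7.6, 12.1, 16.7, 20.3, 22.5]          # mm^3
curvature = [0.18, 0.25, 0.32, 0.29, 0.24, 0.21]     # mm^-1
migration = [None, 0.008, 0.012, 0.009, 0.007, 0.005]  # mm/s, undefined at step 0
amplitude = [0.4, 0.8, 1.2, 1.0, 0.7, 0.5]           # mm


def increments(values):
    """Change between consecutive time steps."""
    return [round(b - a, 3) for a, b in zip(values, values[1:])]


def peak_step(values):
    """Time step at which an indicator reaches its maximum (None entries skipped)."""
    pairs = [(v, s) for v, s in zip(values, steps) if v is not None]
    return max(pairs)[1]


area_growth = increments(surface_area)
volume_growth = increments(volume)
print("surface area growth per interval:", area_growth)
print("volume growth per interval:", volume_growth)
print("peak steps:", peak_step(curvature), peak_step(migration), peak_step(amplitude))
```

Curvature, migration rate, and fluctuation amplitude all peak at time step 40, while the surface-area increment stays near 16 mm² per interval through step 60 and then drops to 13.4 mm² and 7.4 mm², in line with the slowdown described for the later stage.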
The academic value of these evolutionary features lies in the fact that the proposed quantification indicators can precisely capture the "expansion-filling-stabilization" three-stage morphological response in the displacement process. These indicators provide continuous and reliable morphological input for establishing a quantitative model of the "dynamic evolution of interface morphology-displacement efficiency spatial-temporal distribution."
This paper proposes an integrated method for 3D segmentation, reconstruction, and morphological quantification of CT gas-water interfaces based on an improved 3D Attention U-Net. Validation on real core CT gas-water data gives the following core results. In segmentation, the Dice coefficient of the improved 3D Attention U-Net reaches 0.92, a 5.7% improvement over the original model, and the IoU is 0.86. The average surface distance is 2.1 μm and the HD is 5.3 μm, reductions of 56.2% and 49.5%, respectively, compared to the traditional 3D U-Net, achieving pixel-level overlap and micrometer-level boundary restoration. In morphological restoration, the semi-variance curves of the segmentation results are highly consistent with the measured profiles in the X, Y, and Z directions, with local morphological coincidence in the X direction, inflection point reproduction in the Y direction with a baseline deviation below 10%, and precise matching of displacement features in the Z direction. In 3D reconstruction, the point cloud RMSE is only 0.021 mm, and the relative deviations of surface area, enclosing volume, and average curvature are 2.3%, 1.8%, and 3.1%, respectively, all meeting the experimental precision requirements. In quantitative evolution, the core morphological parameters deviate by no more than 2% from the measured values, which allows the three-stage "expansion-filling-stabilization" evolution of the displacement indicators to be captured accurately and provides high-precision data support for analyzing the correlation mechanism between "interface morphology" and "displacement efficiency."
This research has certain limitations: the dataset only covers specific core types and displacement conditions, and the generalization to high-noise images and multiphase mixed interface scenarios has not been fully validated. The model processing time for batch data is relatively long, making it difficult to meet real-time analysis requirements. Future research can be deepened in three areas: expanding the dataset to include different lithologies, permeabilities, and displacement pressures to improve method robustness; optimizing processing efficiency through lightweight networks and parallel computing; and integrating pore-scale flow simulation to build a "segmentation-quantification-mechanism" model to further reveal the intrinsic coupling mechanism between interface evolution and displacement efficiency.
This work was supported by the Central Leading Local Science and Technology Development Fund Program (Grant Nos. 254Z5401G and 246Z0802G).
[1] Arbabi, F., Bazylak, A. (2023). Impact of wettability on immiscible displacement in water saturated thin porous media. Physics of Fluids, 35(5): 053321. https://doi.org/10.1063/5.0144987
[2] Alhosani, A., Scanziani, A., Lin, Q., Selem, A., Pan, Z., Blunt, M.J., Bijeljic, B. (2020). Three-phase flow displacement dynamics and Haines jumps in a hydrophobic porous medium. Proceedings of the Royal Society A, 476(2244): 20200671. https://doi.org/10.1098/rspa.2020.0671
[3] Hasnain, J., Satti, H.G., Sheikh, M., Abbas, Z. (2023). Study of double slip boundary condition on the oscillatory flow of dusty ferrofluid confined in a permeable channel. Fluid Mechanics and Its Applications, 21(4): 671-684. https://doi.org/10.22190/FUME211228019H
[4] Zhang, H., Feng, J.C., Wang, B., Shen, Y., Zhang, Y., Zhang, S. (2025). Micro-CT insights into morphological evolution and kinetics of hydrate phase transitions at the gas-liquid interface. Gas Science and Engineering, 144: 205740. https://doi.org/10.1016/j.jgsce.2025.205740
[5] Omosebi, O.A., Tokunaga, T.K. (2023). Simplified scaling relations for the depth-dependence and IFT reduction of fluid imbibition in gas-saturated reservoir rocks. Gas Science and Engineering, 114: 204973. https://doi.org/10.1016/j.jgsce.2023.204973
[6] Zha, W., Lin, B., Liu, T., Liu, T., Yang, W., Wang, W. (2025). Influence of coal micropore network on gas–liquid two-phase transport. Energy & Fuels, 39(18): 8423-8434. https://doi.org/10.1021/acs.energyfuels.5c01067
[7] Cheng, Z., Wang, J. (2020). Improved region growing method for image segmentation of three-phase materials. Powder Technology, 368: 80-89. https://doi.org/10.1016/j.powtec.2020.04.032
[8] Osman, A.F., Tamam, N.M., Yousif, Y.A. (2023). A comparative study of deep learning‐based knowledge-based planning methods for 3D dose distribution prediction of head and neck. Journal of Applied Clinical Medical Physics, 24(9): e14015. https://doi.org/10.1002/acm2.14015
[9] Klarenberg, R., Bakx, N.L., Hurkmans, C.W. (2025). A comparative analysis of deep learning architectures with data augmentation and multichannel input for locoregional breast cancer radiotherapy. Journal of Applied Clinical Medical Physics, 26(6): e70047. https://doi.org/10.1002/acm2.70047
[10] Amira, H.F., Lilia, K., Nesrine, M., Jihene, M. (2024). Design of corner detection system based on FPGA. Traitement du Signal, 41(6): 3203-3211. https://doi.org/10.18280/ts.410636
[11] Sun, X.B., Yang, X.Q., Liang, J.H. (2023). Calibration method of feature point layout in prefabricated buildings based on image recognition technology. Traitement du Signal, 40(1): 167-174. https://doi.org/10.18280/ts.400115
[12] Heinrich, A., Hubig, M., Teichgräber, U., Mall, G. (2025). Automated identification of unknown decedents: Matching postmortem CT images with clinical databases. International Journal of Legal Medicine, 139(5): 2251-2262. https://doi.org/10.1007/s00414-025-03528-9
[13] Lee, H., Lee, J., Kim, N., Kim, S.J., Shin, Y.G. (2008). Robust feature-based registration using a Gaussian-weighted distance map and brain feature points for brain PET/CT images. Computers in Biology and Medicine, 38(9): 945-961. https://doi.org/10.1016/j.compbiomed.2008.04.001
[14] Audibert, E., Lebas, B., Spriet, C., Habrant, A., Chabbert, B., Paës, G. (2023). Automated quantification of fluorescence and morphological changes in pretreated wood cells by fluorescence macroscopy. Plant Methods, 19(1): 16. https://doi.org/10.1186/s13007-023-00991-6
[15] Ghavami, S., Bayat, M., Fatemi, M., Alizad, A. (2020). Quantification of morphological features in non-contrast-enhanced ultrasound microvasculature imaging. IEEE Access, 8: 18925-18937. https://doi.org/10.1109/ACCESS.2020.2968292
[16] Gangaiah, V., Adarakatti, P.S., Siddaramanna, A., Malingappa, P., Chandrappa, G.T. (2017). Studies on phase and morphological evolution of silver vanadium oxides as a function of pH: Evaluation of electrochemical behavior towards quantification of Pb2+ and Cd2+ ions. Materials Research Express, 4(8): 085039. https://doi.org/10.1088/2053-1591/aa851a
[17] Davarpanah, M., Sanaye-Pasand, M. (2013). Improved gapped-core CT dimensioning algorithm considering relay and system requirements. IEEE Transactions on Power Delivery, 28(2): 788-796. https://doi.org/10.1109/TPWRD.2012.2234485
[18] Yan, Y.T., Chua, S., DeCarlo, T.M., Kempf, P., Morgan, K.M., Switzer, A.D. (2021). Core-CT: A MATLAB application for the quantitative analysis of sediment and coral cores from X-ray computed tomography (CT). Computers & Geosciences, 156: 104871. https://doi.org/10.1016/j.cageo.2021.104871