Application of Image Processing and Identification Technology for Digital Archive Information Management

Zhen Zhang, Xiang Xie

School of Economics and Management, Beijing Jiaotong University, Beijing 100044, China

Beijing Institute of Ecological Geology, Beijing 100120, China

Corresponding Author Email: xxie@bjtu.edu.cn

Pages: 145-152 | DOI: https://doi.org/10.18280/ts.390114

Received: 18 October 2021 | Revised: 16 December 2021 | Accepted: 25 December 2021 | Available online: 28 February 2022

© 2022 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

Abstract: 

It is difficult to manually build a digital management system for archive information. This paper explores the application of image processing and identification technology for digital archive information management, aiming to provide convenient and swift services to archive managers and archive information requestors, and to strengthen the standardized management of archive information. Firstly, the authors summarized the difficulties in digitalizing paper archives, and explained how to correct the tilt of digital archives. Next, adaptive histogram equalization was improved to realize high-quality digitalization of archive images. In addition, the image processing flow for the digitalization of archive images was explained, and the horizontal projection histogram was adopted to quickly extract and detect text in the digitalized archive images. The proposed image processing approach was proved effective through experiments.

Keywords: 

digital archive, archive information management, image processing

1. Introduction

The digitalization of the archive information management system aims to summarize, process, and analyze various archive information resources. The workload of the digitalization process is too huge to be completed manually [1-6]. The overall goal of the system digitalization is to build a broad computer network that applies to management departments at all levels, forming a complete transmission channel for the information about paper archives and digital archives [7-17]. On this basis, it is necessary to develop application system software for digital archive management at multiple levels, and construct a large-scale digital archive database, in order to realize computerized and networked query, retrieval, and utilization of archives [18-23]. The ultimate goal is to provide convenient and swift services to archive managers and archive information requestors, and to strengthen the standardized management of archive information.

The existing intelligent identification methods are generally slow in identification, owing to the fuzziness of digital archives. Based on image recognition technology, Zeng et al. [24] designed an intelligent identification method for digital archives. The image recognition technology was adopted to set up a single identification layer for power supplies, and look up the port mapping table. Then, an intelligent identification model was established, drawing on the attributes of the linear summation function. Soh et al. [25] put forward an intelligent image analysis strategy for automatic detection of poems in digital archives of newspapers, and integrated computer vision into the application "Image Analysis for Archival Discovery (Aida)". The visual cues were captured according to the visual structure of poems. Then, an artificial neural network was trained through machine learning, and used to determine whether an image contains poem texts. Erickson et al. [26] illustrated the image archiving problem, and attempted to define the requirements of enterprise digital image archiving. The identified problems include the cost and complex interfaces of multiple archives, the difference between data processing strategies, the resulting variation in data completeness, and the variation in non-image data. Merzlyakov et al. [27] developed a strategy to convert traditional archives into digital archives. The strategy applies to large archives, which include documents containing basic graphs, and turns the documents into images. A detailed discussion was held on the digitalization, storage, and display techniques for multi-resolution images. The main structural components of digital archives are the relational database and the image library. The two components are physically separated, but logically intertwined. These components form a three-layer distributed architecture, including primary archives, regional replicas, and various auxiliary archives.

To sum up, the domestic research on digital archive information management is still at a primary stage. Most scholars have only scanned paper archives and manually numbered the documents, without further processing the scanned images. In the era of digitalization, such simple processing of archive images faces several problems, such as slow processing speed and poor continuity of strokes. Therefore, this paper explores the application of image processing and identification technology for digital archive information management. The main contents are as follows: (1) presenting the difficulties in digitalizing paper archives, and providing a tilt correction approach for digital archives; (2) improving adaptive histogram equalization to effectively enhance the digitalization quality of archive images; (3) explaining the image processing flow for the digitalization of archive images, and adopting the horizontal projection histogram to quickly extract and detect text in digitalized archive images. The proposed image processing approach was proved effective through experiments.

2. Tilt Correction

Figure 1. Flow of paper archive digitalization

Digitalization of paper archives aims to transform paper archives into digital archives with texts and images as the basic units, and to back up paper archives for long-term storage. Figure 1 shows the flow of paper archive digitalization. The main links generally include sorting out the paper archives, scanning them, digitalizing them, ranking and indexing them, and performing an initial and a second quality check.

Figure 2. Difficulties in paper archive digitalization

Figure 2 displays the difficulties in paper archive digitalization, such as complex image structure, uneven illumination, changing tilt angles, low resolution, and fuzzy local details. The image tilt is a common problem in digital archives, owing to the manual operations and variation in processing devices during the digitalization of paper archives. Therefore, the first step of paper archive digitalization is tilt correction.

The original tilted archive image g(a,b) is filtered by a Gaussian filter with variance $p_l=\varphi^2$:

$v\left( a,b,{{p}_{l}} \right)=g\left( a,b \right)\circ {{H}_{{{p}_{l}}}}$            (1)

where ∘ denotes the convolution operation, and the Gaussian kernel is defined as:

${{H}_{{{p}_{l}}}}=\frac{1}{2\pi {{p}_{l}}}exp\left\{ -\frac{\left( {{a}^{2}}+{{b}^{2}} \right)}{2{{p}_{l}}} \right\}$                (2)

Let v(a,b,pl) be the processed archive image, where l=1,2,...,m correspond to the m scales pl of the archive image.

By controlling the image scale pl, an archive image with ideal row structure information can be obtained. The greater the pl value, the fuzzier the archive image v(a,b,pl). The optimal pl is the one at which the row structure information is the most ideal.
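As a minimal illustration (assuming NumPy/SciPy; the test image and the scale values are placeholders, not the paper's settings), the multi-scale smoothing of Eqs. (1)-(2) might be sketched as follows:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_smooth(g, scales):
    """Return the smoothed images v(a, b, p_l) of Eq. (1), one per scale p_l.

    gaussian_filter expects a standard deviation, so sigma = sqrt(p_l)
    for a Gaussian kernel with variance p_l.
    """
    g = g.astype(np.float64)
    return [gaussian_filter(g, sigma=np.sqrt(p)) for p in scales]

# Example: a random stand-in for a grayscale archive image and three trial scales.
g = np.random.rand(256, 256)
smoothed = multiscale_smooth(g, scales=[1.0, 4.0, 16.0])
```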

The gradient vector of archive image g(a,b) is denoted by (ga,gb)T=(∂g/∂a,∂g/∂b)T, which characterizes the row structure and direction information of the image. If a pixel in the archive image has an ideal row structure, the gradient vector direction of the pixel must be consistent in its neighborhood. Let H(a,b,σ2) be a Gaussian function with the variance of σ2. Then, the row structure analysis tensor in the neighborhood of the pixel can be defined by gradient functions ga and gb:

$\begin{align}  & S{{R}_{\sigma }}\left( g\left( a,b \right) \right) \\ & =\left( \begin{matrix}   H\left( a,b,{{\sigma }^{2}} \right)*{{\left( {{g}_{a}} \right)}^{2}} & H\left( a,b,{{\sigma }^{2}} \right)*{{g}_{a}}{{g}_{b}}  \\   H\left( a,b,{{\sigma }^{2}} \right)*{{g}_{a}}{{g}_{b}} & H\left( a,b,{{\sigma }^{2}} \right)*{{\left( {{g}_{b}} \right)}^{2}}  \\  \end{matrix} \right) \\   \end{align}$                                (3)

In the Gaussian function, the σ value determines the size of the pixel neighborhood. Let μ1 and μ2 be the eigenvalues of the row structure analysis tensor. If μ1 is far greater than μ2, the gradient directions in the neighborhood are strongly anisotropic, i.e., concentrated along one dominant direction. That is, a row structure is very likely to exist at pixel (a,b) of the archive image. The likelihood can be quantified by the linear likelihood R(a,b):

$R\left( a,b \right)=\frac{{{\mu }_{1}}-{{\mu }_{2}}}{{{\mu }_{1}}+{{\mu }_{2}}}$             (4)

Formula (4) shows that R(a,b) falls in [0,1]. If R(a,b) approaches 1, it is very likely for a row structure to exist at pixel (a,b) and its neighborhood; If R(a,b) approaches 0, it is very unlikely for a row structure to exist at pixel (a,b) and its neighborhood. In general, the probability for a row structure to exist at pixel (a,b) and its neighborhood is linearly proportional to linear likelihood R(a,b).

Let u1 and u2 be the eigenvectors of eigenvalues μ1 and μ2, respectively. If R(a,b) approaches 1, the directions of u1 and u2 are perpendicular and parallel to the row structure direction at pixel (a,b), respectively. Hence, the direction of eigenvector u2 at pixel (a,b) is the row structure direction of the archive image. Let u2(a,b,1) and u2(a,b,2) be the components of eigenvector u2 in directions a and b, respectively. Then, the tilt angle of a pixel can be calculated by:

$RSD\left( a,b \right)=arctg\frac{{{u}_{2}}\left( a,b,2 \right)}{{{u}_{2}}\left( a,b,1 \right)}$                 (5)

After defining and analyzing the row structure of the archive image, the proposed tilt detection algorithm for archive images can be summarized as follows. Firstly, the original archive image g(a,b) is filtered to obtain the filtered archive images v(a,b,pl), l=1,2,…,m. Then, the row structure analysis tensor SRσl(a,b) of the archive image is calculated based on v(a,b,pl), where l=1,2,…,m and σl=2√pl. Next, the eigenvalues μl1 and μl2 (with μl1≥μl2), the eigenvectors ul1 and ul2, and the row structure likelihood R(a,b;σl) of each pixel are calculated, and Ropt(a,b)=maxl∈{1,2,...,m}R(a,b;σl). After that, a threshold e is set on the row structure likelihood, and the tilt angle RSD(a,b) is calculated for every pixel with Ropt(a,b)>e. These steps are repeated until all pixels are handled. Finally, the histogram of RSD(a,b) is computed, and the tilt angle of the archive image is taken as the angle corresponding to the peak of the histogram.
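A rough single-scale sketch of this procedure, assuming NumPy/SciPy and purely illustrative parameter values (the scale p, the neighborhood σ, the threshold e, and the histogram binning are placeholders rather than the authors' settings), is given below.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def detect_tilt(img, p=4.0, sigma=4.0, e=0.5, n_bins=180):
    """Single-scale sketch of the structure-tensor tilt detector.

    p     : variance of the pre-smoothing Gaussian in Eq. (1)
    sigma : std of the neighborhood Gaussian in Eq. (3)
    e     : threshold on the row-structure likelihood R(a, b)
    Returns the dominant tilt angle (degrees) as the peak of the RSD histogram.
    """
    v = gaussian_filter(img.astype(np.float64), sigma=np.sqrt(p))
    ga, gb = np.gradient(v)                      # gradients along the two image axes

    # Neighborhood-smoothed structure-tensor components of Eq. (3).
    Jaa = gaussian_filter(ga * ga, sigma)
    Jab = gaussian_filter(ga * gb, sigma)
    Jbb = gaussian_filter(gb * gb, sigma)

    # Closed-form eigenvalues of the 2x2 symmetric tensor (mu1 >= mu2).
    half_trace = 0.5 * (Jaa + Jbb)
    root = np.sqrt((0.5 * (Jaa - Jbb)) ** 2 + Jab ** 2)
    mu1, mu2 = half_trace + root, half_trace - root

    R = (mu1 - mu2) / (mu1 + mu2 + 1e-12)        # linear likelihood, Eq. (4)

    # Eigenvector u2 of the smaller eigenvalue and the per-pixel angle of Eq. (5).
    u2_a, u2_b = Jab, mu2 - Jaa
    rsd = np.degrees(np.arctan(u2_b / (u2_a + 1e-12)))

    hist, edges = np.histogram(rsd[R > e], bins=n_bins, range=(-90.0, 90.0))
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1])
```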

3. Enhancement of Digital Archives

During digitalization, the paper archive images are disturbed by noise, and relevant measures must be taken to suppress it. For an archive image, a small amplification factor is suitable for enhancing image details (e.g., signature and seal) in areas with a large gradient, while a large amplification factor is suitable for smoothing the enhanced details in areas with a small gradient. To achieve both goals, this paper improves adaptive histogram equalization to raise the digitalization quality of archive images.

The improvement of the adaptive histogram equalization intends to enhance the details of the archive image. Let hab and h'ab be the grayscales of image pixels before and after equalization, respectively; nab be the mean grayscale of the neighborhood centering at pixel hab; Ω(*) be the histogram equalization operation based on the sliding window. Then, the relevant calculation formula can be expressed as:

$h_{ab}^{'}=\left\{ \begin{align}  & \Omega \left( {{h}_{ab}} \right)+l\left( {{h}_{ab}}-{{n}_{ab}} \right),0\le h_{ab}^{'}\le 255 \\ & \Omega \left( {{h}_{ab}} \right),otherwise \\   \end{align} \right.$                             (6)

The Ω(*) operation can adjust the dynamic range of the grayscale of image pixels. As a high-pass filter, l(hab- nab) can enhance the local details of the archive image. The l value should be selected under the following two conditions: (1) If hab is outside the detail area of the archive image, l approaches zero; (2) otherwise, the l value is large. Thus, the calculation formula (6) strikes a balance between local details and global grayscale adjustment of the archive image. Let ε2ab and ε2m be the noise variances in the neighborhood and in the entire image, respectively; l' be the scale coefficient. Then, the l value can be defined as:

$l=l'\left( \frac{\varepsilon _{ab}^{2}}{\varepsilon _{m}^{2}}-1 \right)$              (7)

The above analysis shows that, when ε2ab is close to ε2m, there is no detail in the current area of the archive image. In this case, l approaches zero, and it only needs to implement Ω(*) operation on hab. As ε2ab gradually surpasses ε2m, the l value is on the rise, calling for special attention to the details in the current area. In this way, the l value, which adapts to the features of the archive image, is utilized to enhance image details, while suppressing noises.

Noise whose gradient amplitude falls below the threshold is ignored. Let ∇hab be the gradient at pixel (a,b) in the archive image; l and k be the detail determination parameter and detail amplification parameter in the neighborhood, respectively; τ be the threshold gradient. Then, formula (6) can be corrected as:

$h_{ab}^{'}=\left\{ \begin{align}  & \Omega \left( {{h}_{ab}} \right)+lk\left( {{h}_{ab}}-{{n}_{ab}} \right),\left| \nabla {{h}_{ab}} \right|<\tau  \\ & \Omega \left( {{h}_{ab}} \right),\left| \nabla {{h}_{ab}} \right|\ge \tau  \\   \end{align} \right.$                    (8)

The detail amplification parameter k can be characterized by a negative correlation with pixel gradient:

$k=1+{{\delta }_{1}}{{d}^{-\left| \nabla {{h}_{ab}} \right|/{{\delta }_{2}}}}$                  (9)

where, the constant 1 ensures that k is greater than 1, so as to achieve forward amplification of the details of the archive image; δ1 is the maximum contrast multiple; δ2 controls how fast the amplification coefficient decays as the gradient grows. The algorithm flow is detailed as follows (a code sketch is given after the steps):

(1) Initialize the parameters τ, l', δ1 and δ2, and determine the size of the neighborhood.

(2) Compute the adaptive histogram equalization image f(a,b) of the original archive image g(a,b), calculate the gradient image and its variance ε2m, and initialize the enhanced image d(a,b) as equal to f(a,b).

(3) For the current pixel hab in the original archive image, if τ is greater than the absolute gradient at hab in the gradient image, skip the detail amplification for this pixel and go to Step (7).

(4) Compute the variance ε2ab and mean nab of the neighborhood of hab, and then calculate l and k.

(5) Compute the detail amplification term ψ: ψ=lk(hab-nab).

(6) If 0≤f(a,b)+ψ≤ 255, then d(a,b)=f(a,b)+ψ.

(7) If all pixels in the archive image are processed, terminate the algorithm.
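A minimal sketch of steps (1)-(7) is given below, assuming an 8-bit grayscale input and using OpenCV's CLAHE as a stand-in for the sliding-window equalization Ω(*). The parameter values are illustrative, Eq. (9) is taken with base e, and the gradient gate follows Eq. (8) as printed; none of these choices are confirmed by the paper.

```python
import cv2
import numpy as np
from scipy.ndimage import uniform_filter

def enhance(g, tau=10.0, l_prime=0.5, delta1=2.0, delta2=20.0, win=15):
    """Detail-amplifying adaptive equalization sketch for an 8-bit grayscale image.

    Omega(*) is approximated by OpenCV's CLAHE; tau, l_prime, delta1, delta2
    and the window size are illustrative values, not the paper's settings.
    """
    g = g.astype(np.float64)
    f = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8)).apply(
        g.astype(np.uint8)).astype(np.float64)

    gy, gx = np.gradient(g)
    grad = np.hypot(gx, gy)                           # gradient image of step (2)
    var_m = grad.var()                                # global variance eps^2_m

    n_ab = uniform_filter(g, win)                     # local mean n_ab
    eps2_ab = uniform_filter(g * g, win) - n_ab ** 2  # local variance eps^2_ab

    l = l_prime * (eps2_ab / (var_m + 1e-12) - 1.0)   # Eq. (7)
    k = 1.0 + delta1 * np.exp(-grad / delta2)         # Eq. (9), with base e assumed
    psi = l * k * (g - n_ab)                          # detail term of step (5)

    d = f.copy()
    mask = (grad < tau) & (f + psi >= 0) & (f + psi <= 255)  # gate of Eq. (8) and step (6)
    d[mask] = f[mask] + psi[mask]
    return d.astype(np.uint8)
```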

4. Text Detection of Digital Archives

Figure 3 illustrates the image processing flow of archive image digitalization. In addition to tilt correction and image enhancement (both specified in the preceding sections), text detection is an important aspect of applying image processing and detection technology. Since an archive image contains both texts and graphs, it is necessary to detect archive images by developing pertinent rules for archive image judgement and determination. Like English texts, Chinese texts carry inherent printing features. In Chinese archive images, the Chinese characters are usually displayed in printed form, which ensures the consistency of size, color, stroke width, and texture distribution. Therefore, this paper adopts the horizontal projection histogram to quickly extract and detect text in the digitalized archive images. Following the preset rules for archive image judgement and determination, this approach can eliminate from the archive image the areas that clearly do not contain printed Chinese text. Let PE and PS be the end index and start index of a text row in the horizontal projection image, respectively. Then, we have:

$HPH=PE-PS\ge 20$       (10)

Let INS and IMS be the sizes of the input and processed archive images, respectively. To improve the effect of text extraction and detection, an input archive image larger than 700×900 is resized as follows:

$IMS=\left\{ \begin{align}  & \text{70}0\times \text{90}0,INS>\text{70}0\times \text{90}0 \\ & INS,INS\le \text{70}0\times \text{90}0 \\  \end{align} \right.$                 (11)
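As an illustration of Eqs. (10)-(11), candidate text rows can be read off the horizontal projection of a binarized archive image. This is a rough sketch: the Otsu binarization, the width×height reading of 700×900, and the use of OpenCV are assumptions rather than details given in the paper.

```python
import cv2
import numpy as np

def extract_text_rows(img, min_height=20):
    """Return (PS, PE) row-index pairs of candidate text rows.

    Implements the HPH = PE - PS >= 20 rule of Eq. (10) and the size cap of
    Eq. (11); img is assumed to be an 8-bit grayscale archive image.
    """
    # Eq. (11): cap the working size at 700 x 900 (interpreted as width x height).
    h, w = img.shape[:2]
    if w > 700 or h > 900:
        img = cv2.resize(img, (700, 900))

    # Binarize (text pixels -> 1) and project onto the vertical axis.
    _, binary = cv2.threshold(img, 0, 1, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    profile = binary.sum(axis=1)                 # horizontal projection histogram

    rows, start = [], None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                            # PS: first non-empty row
        elif count == 0 and start is not None:
            if i - start >= min_height:          # keep only rows with HPH >= 20
                rows.append((start, i))          # (PS, PE)
            start = None
    if start is not None and len(profile) - start >= min_height:
        rows.append((start, len(profile)))
    return rows
```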

Figure 3. Image processing flow of archive image digitalization

Figure 4. Architecture of text detection model for sensitive digital archive images

Note: BiLSTM is short for bidirectional long short-term memory.

To classify the texts of sensitive digital archive images, this paper combines the BiLSTM with layered attention mechanism into a text detection model for sensitive digital archive images. The architecture of the model is given in Figure 4. Specifically, the BiLSTM is good at extracting the memory features of long-distance texts in archive images, while the layered attention mechanism reflects the importance of text vectors on different levels in archive images, including word-level texts, and sentence-level texts.

Let fip be the output vector of BiLSTM at time p; Qθ and OFθ be the weight and bias of the corresponding character, respectively; tanh( ) be the activation function; vθ be the word-level attention weight; αip be the normalized weight of the i-th character at time p; ri be the weighted sum (sentence) obtained through the dot product between αip and fip at time p. Then, the word-level attention mechanism of the archive image can be calculated by:

${{v}_{ip}}=tanh\left( {{Q}_{\theta }}{{f}_{ip}}+O{{F}_{\theta }} \right)$          (12)

${{\alpha }_{ip}}=\frac{exp\left( v_{ip}^{T}{{v}_{\theta }} \right)}{\sum\limits_{p}{exp\left( v_{ip}^{T}{{v}_{\theta }} \right)}}$               (13)

${{r}_{i}}=\sum\limits_{p}{{{\alpha }_{ip}}{{f}_{ip}}}$           (14)

Let fi be the sentence representation computed from ri via the BiLSTM; Qr and OFr be the weight and bias of the sentence-level vector, respectively; vr be the sentence-level attention weight; αi be the normalized weight; u be the weighted sum (archive image) obtained through the dot product between fi and αi. Then, the sentence-level attention mechanism of the archive image can be calculated by:

${{v}_{i}}=tanh\left( {{Q}_{r}}{{f}_{i}}+O{{F}_{r}} \right)$           (15)

${{\alpha }_{i}}=\frac{exp\left( v_{i}^{T}{{v}_{r}} \right)}{\sum\limits_{i}{exp\left( v_{i}^{T}{{v}_{r}} \right)}}$                (16)

$u=\sum\limits_{i}{{{\alpha }_{i}}{{f}_{i}}}$          (17)
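A minimal PyTorch sketch of this layered attention over BiLSTM outputs is given below. The module structure, dimensions, and class names (e.g., HierarchicalClassifier) are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class Attention(nn.Module):
    """One attention layer: Eqs. (12)-(14) at word level, (15)-(17) at sentence level."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)                 # Q and OF (weight and bias)
        self.context = nn.Parameter(torch.randn(dim))   # v_theta / v_r

    def forward(self, h):                               # h: (batch, steps, dim) BiLSTM outputs f
        v = torch.tanh(self.proj(h))                    # Eq. (12)/(15)
        alpha = torch.softmax(v @ self.context, dim=1)  # Eq. (13)/(16)
        return (alpha.unsqueeze(-1) * h).sum(dim=1)     # Eq. (14)/(17)

class HierarchicalClassifier(nn.Module):
    """Word BiLSTM + attention -> sentence BiLSTM + attention -> classifier."""
    def __init__(self, vocab, emb=128, hidden=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.word_lstm = nn.LSTM(emb, hidden, bidirectional=True, batch_first=True)
        self.word_attn = Attention(2 * hidden)
        self.sent_lstm = nn.LSTM(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.sent_attn = Attention(2 * hidden)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, docs):                            # docs: (batch, n_sents, n_words) token ids
        b, s, w = docs.shape
        words, _ = self.word_lstm(self.embed(docs.view(b * s, w)))
        sents = self.word_attn(words).view(b, s, -1)    # r_i for each sentence
        sents, _ = self.sent_lstm(sents)
        doc = self.sent_attn(sents)                     # u, the archive-image representation
        return self.out(doc)
```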

5. Experiments and Results Analysis

Our experiments were carried out on two self-designed sample libraries of archived newspapers and magazines (N-Ms): one contains Chinese N-Ms, and the other English N-Ms. The tilt angle of the archive images being tested was increased from -90° to 90°, at a step of 2°. The sample set contains a total of 1,549 archive images of Chinese and English N-Ms, which are horizontally positioned or tilted at different angles.

Figure 5 displays the original and enhanced Chinese N-M images. It can be seen that the enhanced archive image was sharper than the original image: in the enhanced image, the stroke edges of the printed characters were sharp, whereas those in the original image were relatively ambiguous. The proposed image enhancement method effectively enlarged the local details in the archive image, and smoothed the curve of enhanced details. Compared with the original image, the enhanced image had a wider grayscale range, darker detail areas, richer image details, and sparser noise.

Figure 5. Original and enhanced Chinese N-M images

Figure 6. Original and enhanced English N-M images

Figure 6 displays the original and enhanced English N-M images. After tilt correction and image enhancement, the English N-M images were applied to the library document archive information system. As can be seen from Figure 6, the English N-M images were effectively enhanced by the filtering and linear transform of our algorithm.

To verify its effect on archive image processing, our algorithm was compared with the text detection method based on edge detection (Algorithm 1), that based on character structure features (Algorithm 2), and that based on corner detection (Algorithm 3). Table 1 presents the test performance indices of the different algorithms. Table 2 compares the F1-scores and time costs of these methods in text detection of Chinese and English N-M archives.

Table 1. Test performance indices of different algorithms

Method          Precision   Recall   F1-score
Algorithm 1     0.74        0.85     0.92
Algorithm 2     0.93        0.84     0.87
Algorithm 3     0.86        0.61     0.75
Our algorithm   0.97        0.94     0.98

Table 2. F1-scores and time costs of text detection of Chinese and English N-M archives

Method          Chinese F1-score   Chinese time cost   English F1-score   English time cost
Algorithm 1     0.72               10.95               0.69               13.71
Algorithm 2     0.84               12.68               0.77               14.56
Algorithm 3     0.71               10.26               0.68               17.28
Our algorithm   0.96               0.62                0.91               0.88

As shown in Tables 1 and 2, our algorithm has better precision, recall, and F1-score than the other text detection methods. The proposed horizontal projection histogram can detect and extract Chinese texts from archive images quickly and accurately, because it fully considers the features of printed Chinese characters.

Finally, the parameters of our text detection model for sensitive digital archive images were configured as follows: the activation function of the output layer, softmax function; dropout and recurrent dropout, 0.3; batch size, 128; learning rate, 0.001; attention dimensionality, 256; maximum number of iterations, 100. Word-level texts were constructed with Chinese words as the smallest units, and imported to the input layer of the model. Figure 7 presents the training effect of our text detection model on sensitive digital archive images. It can be seen that our text detection model achieved good training and verification effects, in terms of detection accuracy and iterative loss.
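For reference, these hyperparameters might be wired into a training setup as sketched below, reusing the hypothetical HierarchicalClassifier from the previous sketch; the Adam optimizer, vocabulary size, and dummy data are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

config = {
    "dropout": 0.3,          # dropout / recurrent dropout (not wired into the sketch above)
    "batch_size": 128,
    "learning_rate": 1e-3,
    "attention_dim": 256,
    "max_epochs": 100,
}

# Dummy corpus: 512 "documents" of 8 sentences x 30 word ids, two classes.
docs = torch.randint(0, 50000, (512, 8, 30))
labels = torch.randint(0, 2, (512,))
loader = DataLoader(TensorDataset(docs, labels),
                    batch_size=config["batch_size"], shuffle=True)

# hidden = 128 so that the bidirectional outputs (and attention) have dimension 256.
model = HierarchicalClassifier(vocab=50000, hidden=config["attention_dim"] // 2)
criterion = nn.CrossEntropyLoss()    # softmax output layer + cross-entropy loss
optimizer = torch.optim.Adam(model.parameters(), lr=config["learning_rate"])

for epoch in range(config["max_epochs"]):
    for batch_docs, batch_labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_docs), batch_labels)
        loss.backward()
        optimizer.step()
```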

Figure 7. Training effect of our text detection model on sensitive digital archive images

6. Conclusions

This paper mainly examines how to apply image processing and identification technology for digital archive information management. First of all, the authors summed up the difficulties in digitalizing paper archives, and developed a tilt correction method for digital archives. After that, adaptive histogram equalization was improved to effectively enhance the digitalization quality of archive images. Then, the authors demonstrated the image processing flow for the digitalization of archive images, and adopted the horizontal projection histogram to quickly extract and detect text in digitalized archive images. Through experiments, the authors compared the original and enhanced Chinese and English N-M images; compared with the original images, the enhanced images contained richer image details and sparser noise. In addition, the performance indices, F1-scores, and time costs of different detection algorithms were obtained through Chinese and English text detection experiments. The results confirm that our algorithm has superior precision, recall, and F1-score in text detection. Finally, the authors tested the training effect of our text detection model on sensitive digital archive images, and found that it achieved good training and verification effects, in terms of detection accuracy and iterative loss.

References

[1] Bishi, A. (2015). Digital archiving-the current state at the National Archives of Zimbabwe. In 2015 Digital Heritage, 2: 403-404. https://doi.org/10.1109/DigitalHeritage.2015.7419534

[2] Güldenpfennig, F., Fitzpatrick, G. (2015). Personal digital archives on mobile phones with MEO. Personal and Ubiquitous Computing, 19(2): 445-461. https://doi.org/10.1007/s00779-014-0802-3

[3] Kim, H., Yeom, H. (2017). Improving small file I/O performance for massive digital archives. In 2017 IEEE 13th International Conference on e-Science (e-Science), 256-265. https://doi.org/10.1109/eScience.2017.39

[4] Varela, M.E. (2015). Digital archives and digital methods: An Indonesian case study. Digital Libraries: Providing Quality Information, 9469: 340-341. https://doi.org/10.1007/978-3-319-27974-9

[5] Delaney, B., De Jong, A. (2015). Media archives and digital preservation: Overcoming cultural barriers. New Review of Information Networking, 20(1-2): 73-89. https://doi.org/10.1080/13614576.2015.1112626

[6] Mitchell, E. (2015). The Registries of Shared Print and Shared Digital Archives: What They Mean for Libraries. Technical Services Quarterly, 32(2): 173-186. https://doi.org/10.1080/07317131.2015.998480

[7] Weber, N., Fenlon, K., Organisciak, P., Thomer, A.K. (2019). Workshop on Conceptual Models in Digital Libraries, Archives, and Museums. In 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), pp. 457-458. https://doi.org/10.1109/JCDL.2019.00117

[8] Maltese, V., Giunchiglia, F. (2016). Search and analytics challenges in digital libraries and archives. Journal of Data and Information Quality (JDIQ), 7(3): 1-3. https://doi.org/10.1145/2939377

[9] Cerf, V.G. (2017). The role of archives in digital preservation. Communications of the ACM, 61(1): 7-7.

[10] Yang, Y. (2015). Research on application of digital literature archives management based on XML Database System. Journal of Digital Information Management, 13(5): 367-372.

[11] Xu, W., Esteva, M., Trelogan, J. (2018). Cyberinfrastructure for digital libraries and archives: Integrating data management, analysis, and publication. In Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries, pp. 423-424. https://doi.org/10.1145/3197026.3200211

[12] Ouyang, H. (2016). Preliminary study on safety technology of digital archives. In 2016 International Conference on Network and Information Systems for Computers (ICNISC), pp. 284-286. https://doi.org/10.1109/ICNISC.2016.069

[13] Borden, B.B., Baron, J.R. (2016). Opening up dark digital archives through the use of analytics to identify sensitive content. In 2016 IEEE International Conference on Big Data (Big Data), pp. 3224-3229. https://doi.org/10.1109/BigData.2016.7840978

[14] Lin, W.X., Zuo, J.Q., Su, S., Chen, C. (2019). A double-blockchains based Digital Archives Management Framework and Implementation. In 2019 IEEE 14th International Symposium on Autonomous Decentralized System (ISADS), pp. 1-6. https://doi.org/10.1109/ISADS45777.2019.9155628

[15] Li, X., Yin, X., Gu, T. (2019). Common air pollutants and their prevention in digital archives. In IOP Conference Series: Earth and Environmental Science, 300(3): 032017. https://doi.org/10.1088/1755-1315/300/3/032017

[16] Caruana, M. (2016). Analysis of data from Maltese Passport Applications held at the National Archives of Malta: A new digital resource. New Review of Information Networking, 21(1): 52-62. https://doi.org/10.1080/13614576.2016.1234842

[17] Blezinger, D., Van Den Hoven, E. (2016). Storytelling with objects to explore digital archives. In Proceedings of the European Conference on Cognitive Ergonomics, pp. 1-7. https://doi.org/10.1145/2970930.2970944

[18] Apollonio, F.I. (2017). The production of 3D digital archives and the methodologies for digitally supporting research in architectural and urban cultural heritage. In Digital Research and Education in Architectural Heritage, pp. 139-158. https://doi.org/10.1007/978-3-319-76992-9_9

[19] Goto, M. (2018). Current movement of "Digital Archives in Japan" and "khirin (Knowledgebase of Historical Resources in Institutes)". In 2018 Pacific Neighborhood Consortium Annual Conference and Joint Meetings (PNC), pp. 1-4. https://doi.org/10.23919/PNC.2018.8579461

[20] Andrea Ludovico, L., Baratè, A., Simonetta, F., Andrea Mauro, D. (2019). On the adoption of standard encoding formats to ensure interoperability of music digital archives: The IEEE 1599 format. In 6th International Conference on Digital Libraries for Musicology, 20-24. https://doi.org/10.1145/3358664.3358665

[21] Wells, C.M. (2019). Total digital access to the League of Nations archives: Digitization, digitalization, and analog concerns. In Archiving Conference, 1: 12-16.

[22] Meyer, H., Bruder, I., Finger, A., Heuer, A. (2015). Building digital archives: Design decisions: A best practice example. In 2015 4th International Symposium on Emerging Trends and Technologies in Libraries and Information Services, 59-64. https://doi.org/10.1109/ETTLIS.2015.7048172

[23] Wang, J.H., Chang, H.C. (2015). CoBITs: a distributed indexing approach to collaborative content-based multimedia retrieval across digital archives. Multimedia Tools and Applications, 74(8): 2639-2658. https://doi.org/10.1007/s11042-013-1461-5

[24] Zeng, L., Liang, H., Meng, L., Yang, Y., Guo, Q. (2021). Intelligent Recognition of Digital Archives of Application and Installation in Power Business Expanding Based On Image Recognition Technology. In 2021 IEEE International Conference on Computer Science, Artificial Intelligence and Electronic Engineering (CSAIEE), pp. 143-146. https://doi.org/10.1109/CSAIEE54046.2021.9543456

[25] Soh, L.K., Lorang, E., Liu, Y. (2018). Aida: intelligent image analysis to automatically detect poems in digital archives of historic newspapers. In Thirty-Second AAAI Conference on Artificial Intelligence.

[26] Erickson, B.J., Persons, K.R., Hangiandreou, N.J., James, E.M., Hanna, C.J., Gehring, D.G. (2001). Requirements for an enterprise digital image archive. Journal of Digital Imaging, 14(2): 72-82. https://doi.org/10.1007/s10278-001-0005-0

[27] Merzlyakov, N.S., Rubanov, L.I., Karnaukhov, V.N. (2002). Multiscale image presentation in a digital archive. In Second International Conference on Image and Graphics, 4875: 1067-1074. https://doi.org/10.1117/12.477115