Rate-4/5 4-ary modulation code for 4-level holographic data storage systems

Abstract This paper presents a rate-4/5 4-ary modulation code design for 4-level holographic data storage (HDS) systems. Multi-level HDS systems, capable of far greater storage densities than traditional storage technologies, are regarded as one of the most promising candidates for ultra-high-capacity optical storage devices. In particular, HDS systems are a promising solution for cloud storage services. Yet these systems face many challenges, such as two-dimensional (2D) interference, blur effect, and misalignment. The proposed code is designed with a 4-step procedure so that the codewords at the output of the encoder not only avoid the worst 2D interference but also obtain a minimum distance of 0.47 between any two codewords, which is larger than the minimum distance of a conventional 6/9 modulation code. Simulation results demonstrate that the proposed code gains about 1 dB over the traditional 6/9 modulation code at a bit error rate of $10^{-9}$.


Introduction
The massive growth of regularly updated data has challenged traditional storage systems. To deal with ever-increasing demands, novel technologies are needed to expand the storage capacity.
About the author: Chi Dinh Nguyen received the B.Eng. degree in Electrical and Electronic Engineering from the Le Quy Don Technical University, Hanoi, Vietnam, in 2012, and the M.Eng. and Ph.D. degrees in Information Telecommunication Engineering from Soongsil University, Seoul, South Korea, in 2015 and 2017, respectively. From 2017 to 2019, he was a Postdoctoral Research Fellow at the Singapore University of Technology and Design. From 2019 to 2021, he was a Lecturer/Researcher with the Phenikaa University. Currently, he is a Director of IT Program with the FPT University, Hanoi, Vietnam. He was the first and major author of one best paper and two outstanding papers from the 2020 International Conference on Advanced Technologies for Communications and the Asia-Pacific Magnetic Recording Conference in 2016 and 2018. He is also the major inventor of four patents issued by South Korea so far. His main research interests include signal processing and coding for information storage systems, applying learning algorithms to channel detection and decoding, and digital transmission engineering for telecommunication systems.
Several new technologies for magnetic recording have been examined, including bit-patterned media recording and energy-assisted magnetic recording (Shiroishi et al., 2009). Emerging nonvolatile memories (Khan & Ghosh, 2020), such as three-dimensional NAND flash, resistive random-access memory (ReRAM), and spin-transfer torque magnetic random-access memory (STT-MRAM), are expected to replace established random-access memory (RAM) technologies, such as dynamic or static RAM, and to lead the electronic memory industry. As for optical storage devices, holographic data storage (HDS) systems have become the leading candidates for ultrahigh-density optical storage because of their fast access time, extensive storage capability, greener storage, and potential uses. The optical working principle of HDS can be briefly described as follows. The HDS system stores data throughout the volume of the medium rather than on its surface. The data sources are first formatted into m-ary symbols (e.g., binary bits in the case of binary HDS), and the symbols are then converted into multi-level square pixels. A page composing device, using a laser, impresses the pixels onto a signal beam. The signal beam interferes with a reference beam to record an interference grating (page) inside tiny holograms. Adjusting the laser angle (the angular multiplexing technique) allows various pages to be recorded in one volume. The recorded data can be retrieved by combining an incident reference beam on the medium with a high-performance detector.
Despite their potential, HDS systems must overcome many challenges before enterprises can commercialize HDS devices. The main challenges, in terms of signal processing, are misalignment, inter-page interference (IPI), blur effect, and two-dimensional (2D) interference (Gertz et al., 2015, 2016; Katano et al., 2017; Nguyen & Lee, 2015b). Intersymbol interference (ISI) is inherent in high-density storage systems, and the effect becomes significant in 2D storage systems. In other words, 2D interference is regarded as the primary defect that may seriously deteriorate the quality of retrieved data.
Unlike binary HDS systems, multi-level HDS allows one pixel to represent more than one bit. This advantage enables multi-level HDS systems to store more information at the same resolution (Lihua et al., 2005). However, the effects mentioned above, especially the 2D interference, can seriously deteriorate the quality of retrieved data in multi-level HDS systems. The 2D interference is, in fact, a very challenging problem for multi-level HDS systems and for most of today's other promising high-density storage systems. To overcome this problem, many solutions have been proposed (Chen et al., 2015; J. Kim & Lee, 2009a, 2009b; Nakamura & Hoshizawa, 2016; Nguyen & Lee, 2017; Zheng et al., 2021). Efficient approaches include advanced partial response maximum likelihood (PRML) detection (J. Kim & Lee, 2009a; Zheng et al., 2021) and the improved 2D soft-output Viterbi algorithm (2D SOVA) (J. Kim & Lee, 2009b; Nguyen & Lee, 2017). These solutions are implemented during information retrieval. Advanced encoding schemes, in contrast, are deployed before the data is recorded into the storage channel. These techniques significantly improve the reliability of stored data but also increase the system's complexity (Chen et al., 2015; Nakamura & Hoshizawa, 2016).
Another approach is modulation coding (Immink & Cai, 2018; Y. Kim et al., 2013; Nguyen & Lee, 2015a; Nguyen et al., 2021; Park et al., 2013), which shapes the recorded information sequence before it is written to the media. Several modern modulation codes (Immink & Cai, 2018; Nguyen & Lee, 2015a; Nguyen et al., 2021) have been proposed for next-generation data storage systems. Recently, simple and efficient multi-level modulation codes have been proposed for multi-level storage devices. For instance, the authors in (Y. Kim et al., 2013) proposed several multi-level modulation codes for flash memories that enhance data reliability by converting destructive signal patterns into more robust ones. A 6/9 4-ary modulation code (Park et al., 2013) was introduced, wherein six input symbols were converted into nine output symbols arranged in a 3 × 3 array, yielding a code rate of 0.667. These output symbols were arranged so that the smallest-level and largest-level symbols were not adjacent. The bit-error-rate (BER) and symbol-error-rate (SER) performances of the traditional 6/9 code improve on those of a random sequence. The 6/9 modulation code is used as a benchmark to evaluate the proposed code.
Besides matching the features of the storage media, robust modulation codes need a high code rate and some error-correcting capability. In this paper, we report, for the first time, a 4/5 4-ary modulation code for 4-level HDS systems. The proposed code is constructed, with a high code rate and a simple encoder and decoder, to avoid the worst 2D interference. In particular, with the proposed encoding procedure, the 4/5 4-ary modulation code obtains a minimum distance of 0.47 between any two codewords in the codeword set.
The rest of this paper is organized as follows. The proposed 4/5 4-ary modulation code is presented in Section 2. Section 3 briefly presents the holographic channel model. Section 4 shows the main simulation results. Concluding remarks are given in Section 5.

Proposed 4/5 4-ary modulation code
Modulation codes are designed under constraints that reduce the risk of corrupting signals during the writing and reading processes. In multi-level HDS systems, the maximum 2D interference occurs when the largest symbol is next to the smallest. In this study, the largest symbol is "3" and the smallest is "0". Thus, when a "0" symbol (or a "3" symbol) is surrounded by many "3" symbols (or "0" symbols), the 2D interference is most severe. Figure 1a shows typical interference patterns and the worst 2D interference. Figure 1b illustrates some 2D interference on a particular page. As can be seen, many patterns of the largest symbol next to the smallest symbol can be generated. To improve the system performance, such patterns must be forbidden in the codewords at the output of the encoder. Besides avoiding the worst 2D interference, the proposed 4/5 4-ary modulation code is designed to enlarge the trellis's free distance and enhance the error-correcting capability.
The proposed encoder translates a 4-symbol user-data sequence into a 5-symbol codeword of size 5 × 1. Among the 1024 possible output codewords ($1024 = 4^5$), we deliberately select the 256 best codewords for the 256 input data possibilities ($256 = 4^4$). We first define the 5-symbol codeword of size 5 × 1 as $\mathbf{c} = [c[p-2]\ c[p-1]\ c[p]\ c[p+1]\ c[p+2]]^T$. As mentioned earlier, if a symbol at the largest or the smallest level is surrounded by neighboring symbols at the opposite level, the symbol tends to be detected incorrectly because of the worst 2D interference. In other words, the patterns [0 3 0] and [3 0 3] must be avoided in any codeword. The proposed modulation code is constructed in four steps as follows.
Step 1. Observe the three middle symbols, i.e., c[p−1], c[p], c[p+1]. We discard the patterns that cause the worst 2D interference in these three symbols. In other words, the codewords $[A\ 0\ 3\ 0\ A]^T$ and $[A\ 3\ 0\ 3\ A]^T$ must be removed from the codeword set, where $A \in \{0, 1, 2, 3\}$. Discarding these codewords reduces the number of suitable codewords to 992, i.e., $992 = 1024 - 32$.
Step 2. The 992 remaining codewords are divided into two groups. The first group includes the codewords containing the patterns $[A\ B\ 0\ 3\ 0]^T$, $[A\ B\ 3\ 0\ 3]^T$, $[0\ 3\ 0\ B\ A]^T$, and $[3\ 0\ 3\ B\ A]^T$, where $A \in \{0, 1, 2, 3\}$ and $B \in \{0, 1, 2\}$. The other group includes the codewords that do not contain the patterns [0 3 0] and [3 0 3]. The codewords of the first group are listed in Table 1. The number of destructive codewords in the first group is 48, i.e., $48 = 4 \times 12$. All codewords of the first group must be excluded as well. Therefore, after the second step, 944 possible codewords remain, i.e., $944 = 992 - 48$.
Step 3. Choose a set of codewords such that the minimum Euclidean distance between any two selected codewords is 0.47. To do that, we first denote the pixel intensities for the 4-ary symbols {0, 1, 2, 3} as {0, 0.33, 0.66, 1}. Then, using a brute-force search, we select a codeword set with a minimum Euclidean distance of 0.47. This distance is suitable for generating enough good codewords for the code rate of 4/5 while significantly enlarging the free distance on the trellis, which improves the error-correcting capability. The search shows that there are 316 suitable codewords.
Step 4. Calculate the average symbol difference between neighboring cells in each codeword, and select the 256 codewords with the smallest average value. For simplicity, let $c[p]^{(j)}$ denote the symbol in the $p$-th cell of codeword $j$. The average value of the symbol difference, $a^{(j)}$, is calculated for each codeword $j$. The best 256 codewords are assigned to the 256 4-symbol input data possibilities using a 1:1 mapping rule. A list of the 256 codewords of the proposed code in 4-ary form is shown in Table 2. The codewords are numbered from 0 to 255, top-down and left-to-right. We do the mapping based on the First In First Out rule. This straightforward mapping technique is deployed to reduce the complexity of the encoding and decoding process. Figure 2 shows the codewords in a part of a page encoded by the proposed code. As observed, the worst case of 2D interference no longer occurs. Yet it is essential to note that some patterns of "0" next to "3" or "3" next to "0" are still allowed in order to guarantee a reasonable code rate for the proposed code.
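The four steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the search used in the paper: the greedy selection in `greedy_select` stands in for the paper's search and is not expected to reproduce the reported count of 316 codewords, and `avg_neighbor_diff` assumes that the average symbol difference is the mean absolute difference between adjacent symbols. All function and variable names are hypothetical.

```python
from itertools import product
import math

LEVELS = (0.0, 0.33, 0.66, 1.0)      # pixel intensities of symbols 0..3 (Step 3)
FORBIDDEN = {(0, 3, 0), (3, 0, 3)}   # worst 2D interference patterns

def has_forbidden(word, starts):
    """True if a forbidden triple begins at any index listed in `starts`."""
    return any(word[i:i + 3] in FORBIDDEN for i in starts)

def distance(a, b):
    """Euclidean distance between two codewords in the intensity domain."""
    return math.sqrt(sum((LEVELS[x] - LEVELS[y]) ** 2 for x, y in zip(a, b)))

def avg_neighbor_diff(word):
    """ASSUMED metric: mean absolute difference between adjacent symbols."""
    pairs = len(word) - 1
    return sum(abs(word[i + 1] - word[i]) for i in range(pairs)) / pairs

def greedy_select(candidates, d_min):
    """Greedy stand-in for the search: keep a word only if it is at least
    d_min away from every word kept so far."""
    chosen = []
    for w in candidates:
        if all(distance(w, c) >= d_min for c in chosen):
            chosen.append(w)
    return chosen

all_words = list(product(range(4), repeat=5))                   # 4^5 = 1024
step1 = [w for w in all_words if not has_forbidden(w, (1,))]    # middle triple
step2 = [w for w in step1 if not has_forbidden(w, (0, 2))]      # edge triples
step3 = greedy_select(step2, d_min=0.47)
step4 = sorted(step3, key=avg_neighbor_diff)[:256]              # smoothest words first

print(len(all_words), len(step1), len(step2), len(step3), len(step4))
```

The first three printed counts follow directly from the pattern constraints (1024, 992, 944). The last two depend on the selection strategy, so they should be read as properties of this sketch rather than of the proposed code.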
A demodulator (decoder) decodes the received codeword with a maximum likelihood decision. The codeword picked from the codeword set as the transmitted one is the codeword whose Euclidean distance $\delta$ to the received codeword is minimum. Thereby, $\delta\left(\hat{c}^{(j)}, c^{(j)}\right) = \sqrt{\sum_{p} \left(\hat{c}[p]^{(j)} - c[p]^{(j)}\right)^2}$, where $\hat{c}^{(j)}$ is the $j$-th received codeword, and $c^{(j)}$ is the $j$-th codeword in the codeword set. $\hat{c}[p]^{(j)}$ is the $p$-th symbol of the $j$-th received codeword, and $c[p]^{(j)}$ is the $p$-th symbol of the $j$-th codeword in the codebook. Finally, a de-mapping process is needed to obtain the original 4-symbol information sequence.
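The minimum-distance decision and the de-mapping can be illustrated with a short Python sketch. The four-word `toy_codebook` is a made-up example rather than the code's actual 256-word table, and `index_to_symbols` assumes that codeword number j is paired with the j-th 4-symbol input in base-4 counting order, which is only one possible reading of the 1:1 mapping rule.

```python
import math

LEVELS = (0.0, 0.33, 0.66, 1.0)  # pixel intensities of symbols 0..3

def ml_decode(received, codebook):
    """Return the index of the codeword nearest (in Euclidean distance)
    to the received intensity vector."""
    def dist(word):
        return math.sqrt(sum((r - LEVELS[s]) ** 2 for r, s in zip(received, word)))
    return min(range(len(codebook)), key=lambda j: dist(codebook[j]))

def index_to_symbols(j):
    """ASSUMED de-mapping: write index j as four base-4 digits, most significant first."""
    return tuple((j >> (2 * k)) & 3 for k in (3, 2, 1, 0))

# Hypothetical toy codebook; the real codebook has 256 entries.
toy_codebook = [
    (0, 0, 0, 0, 0),
    (1, 1, 0, 0, 0),
    (2, 2, 1, 1, 0),
    (3, 3, 2, 2, 1),
]

noisy = [0.30, 0.41, 0.05, -0.02, 0.08]   # noisy version of codeword 1
index = ml_decode(noisy, toy_codebook)
user_symbols = index_to_symbols(index)
print(index, user_symbols)
```

An exhaustive search over 256 codewords per received block is cheap, which fits the paper's emphasis on a simple decoder.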

HDS channel model
We examined the proposed modulation code over the HDS channel. For simplicity, and for consistency with previous research, we used a point spread function (PSF) to model the HDS channel. The continuous PSF is parameterized by the grade of the blur $\sigma_b$, the x-axis misalignment $m_x$, the y-axis misalignment $m_y$, and the signal amplitude $A$ ($A = 1$ herein), and it is built on the 2D sinc function, defined as $\mathrm{sinc}(x, y) = \frac{\sin(\pi x)}{\pi x} \cdot \frac{\sin(\pi y)}{\pi y}$. For a discrete PSF, since about 98% of the energy of a pixel is incident on the 5 × 5 array (Keskinoz & Kumar, 2000; Vadde & Kumar, 1999), the light incident over the array is integrated into amplitude/intensity to yield a 5 × 5 discrete PSF. The resulting discrete channel model (Keskinoz & Kumar, 2000; K. Kim et al., 2018; Nguyen & Lee, 2016; Vadde & Kumar, 1999) assumes that the fill factors are equal to 1 and are the same in both directions. The proposed encoder converts the input data into a 2D data array. The signal is then degraded by the HDS channel and by additive white Gaussian noise (AWGN). The signal at the equalizer input is denoted $r[p, q]$.
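As a rough illustration of this channel, the sketch below builds a 5 × 5 discrete PSF by numerically integrating a continuous PSF over unit pixels, then applies it to a page with additive Gaussian noise. The continuous form used here, $(A/\sigma_b^2)\,\mathrm{sinc}^2((x - m_x)/\sigma_b, (y - m_y)/\sigma_b)$, is an assumption borrowed from common HDS channel models and may differ from the exact expression used in this paper. The zero-padded convolution and all function names are likewise illustrative choices.

```python
import math
import random

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def continuous_psf(x, y, sigma_b, m_x=0.0, m_y=0.0, amp=1.0):
    """ASSUMED continuous PSF: (A / sigma_b^2) * sinc^2 of the shifted, scaled axes."""
    sx = sinc((x - m_x) / sigma_b)
    sy = sinc((y - m_y) / sigma_b)
    return amp / sigma_b ** 2 * (sx * sy) ** 2

def discrete_psf(sigma_b, m_x=0.0, m_y=0.0, size=5, steps=20):
    """Integrate the continuous PSF over each unit pixel (fill factor 1)
    with the midpoint rule to obtain a size x size discrete PSF."""
    half = size // 2
    d = 1.0 / steps
    h = [[0.0] * size for _ in range(size)]
    for p in range(-half, half + 1):
        for q in range(-half, half + 1):
            total = 0.0
            for i in range(steps):
                x = p - 0.5 + (i + 0.5) * d
                for k in range(steps):
                    y = q - 0.5 + (k + 0.5) * d
                    total += continuous_psf(x, y, sigma_b, m_x, m_y)
            h[p + half][q + half] = total * d * d
    return h

def channel(page, h, noise_std, rng):
    """Zero-padded 2D convolution of the page with h, plus AWGN."""
    rows, cols, half = len(page), len(page[0]), len(h) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for p in range(rows):
        for q in range(cols):
            acc = 0.0
            for i in range(-half, half + 1):
                for k in range(-half, half + 1):
                    pp, qq = p - i, q - k
                    if 0 <= pp < rows and 0 <= qq < cols:
                        acc += h[i + half][k + half] * page[pp][qq]
            out[p][q] = acc + rng.gauss(0.0, noise_std)
    return out

h = discrete_psf(sigma_b=1.0)
page = [[0.0] * 5 for _ in range(5)]
page[2][2] = 1.0                                   # single bright pixel
clean = channel(page, h, noise_std=0.0, rng=random.Random(0))
noisy = channel(page, h, noise_std=0.05, rng=random.Random(0))
```

With a single bright pixel and no noise, the channel output reproduces the discrete PSF itself, which is a convenient sanity check before running full pages.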

Simulation results
The performance of the proposed modulation code is assessed over the HDS channel, where the data page size is 1024 × 1024 pixels. We used 1000 pages for the simulation. To evaluate the proposed 4/5 4-ary modulation code, we compare it with the conventional 6/9 4-ary modulation code (Park et al., 2013), which was constructed by mixing two complicated mapping tables. Moreover, it is essential to note that the minimum Euclidean distance between codewords in (Park et al., 2013) was equal to 0.33. Figure 4 shows the BER performance of both codes. The performances were estimated at the outputs of the SOVA detector and the demodulator. As can be seen, the BER performance of the proposed 4/5 modulation code is much better than that of the 6/9 code. The proposed code obtains a gain of about 1 dB over the 6/9 code.
Figure 5 shows the symbol error rate (SER) performance of the proposed 4/5 modulation code and the 6/9 code under the simultaneous effects of blur and misalignment. Misalignment in HDS comes from the physical limits of mechanical, electrical, and optical components. It may stem from one of these components or from all of them at once, and such errors can lead to severe misalignment. In this paper, the misalignment of the HDS system is defined relative to the normalized pixel pitch on the x-axis, $T_x = 1$. A similar definition is applied on the y-axis. The proposed code performs better than the 6/9 code in all cases of misalignment. For example, the proposed code reaches its best performance (i.e., the lowest BER, with no remaining errors) if the SNR is greater than or equal to 16 dB. In comparison, at least 17 dB is needed to achieve the same performance using the conventional 6/9 code.
Finally, we estimate the performance of both codes under the effect of blur. The estimation is carried out at 15 dB with and without the influence of misalignment. As shown in Figure 6, the performance of the codes varies with the grade of the blur. As the grade of the blur is raised, the BER performance degrades both with and without misalignment. With better BER performance over the entire range of the grade of the blur, the proposed code is less sensitive to blur than the traditional 6/9 code. In addition, under the influence of misalignment, the performances of the two codes tend to converge as the blur increases.

Conclusion
We propose a 4-step procedure to design, for the first time, a 4/5 4-ary modulation code that satisfies both goals of a higher code rate and a longer free distance. The proposed code converts every 4-symbol user-data sequence into a 5-symbol codeword of size 5 × 1. The proposed modulation code is constructed to enlarge the minimum distance of the code and to avoid the worst 2D interference patterns for multi-level HDS. Based on this design, the proposed code achieves a code rate gain of about 20% (0.8 versus 0.667 for the 6/9 code). The proposed code also gains approximately 1 dB over the conventional 6/9 modulation code at a BER of $10^{-9}$.
Figure 1. (a) Typical interference patterns; (b) 2D interference on a page.

Figure 2. An example of the proposed code on a page.

Figure 4. Comparison of BER performance at the outputs of the demodulator and SOVA at 1.0 blur.

Figure 5. Comparison of SER performances at 1.0 blur.