FPGA-based modified multi-repeat distribution matcher for probabilistic amplitude shaping

The technique of probabilistic amplitude shaping, based on distribution matching, has garnered considerable attention in recent years as a means to enhance spectral efficiency and reduce the constellation energy of coded modulation. This paper presents an implementation of Probabilistic Amplitude Shaping (PAS) using a Modified Multi-Repeat Distribution Matcher (MMRDM) on a Field Programmable Gate Array (FPGA). The MMRDM is integrated into a 2×2 Multiple Input Multiple Output Orthogonal Frequency Division Multiplexing (MIMO-OFDM) system, realized through the Xilinx System Generator (XSG). Simple zero-forcing (ZF) and Minimum Mean Squared Error (MMSE) equalizers are applied at the receiver for signal detection across the MIMO channel. The system incorporates enhanced security through chaos-based scrambling with 16- and 64-Quadrature Amplitude Modulation (QAM). VHDL code files for this system are generated for the Xilinx Kintex-7 (xc7k325t-3fbg676) for hardware implementation. Performance evaluation includes an assessment of required storage capacity, complexity, and bit error rate (BER). Using Vivado 2017.4, the system is successfully routed; for 64-QAM uniform modulation, the resource utilization is 0.67% Block RAM (BRAM), 68.6% Look-Up Tables (LUT), 83% DSP48s, and 1.5% registers. Similarly, for 64-QAM 10-level (shaper output 60 bits) shaped modulation, the resource utilization is 0.67% BRAM, 68.8% LUT, 83% DSP48s, and 1.6% registers on the specified device. Simulation results demonstrate a net shaping gain of approximately 2-4 dB at a BER of 1×10^(-4) for the different equalizer cases compared to uniform QAM, along with a notable reduction in required storage capacity and computational complexity.


INTRODUCTION
Probabilistic shaping (PS) has emerged as a potent tool in digital signal processing (DSP) for various communication systems, particularly in achieving capacity-approaching performance and adapting rates in accordance with the Shannon limit (Böcherer et al., 2015; Buchali et al., 2015). In typical PS systems, distribution matching (DM) is positioned outside forward error correction (FEC) coding for simplified implementation. The utilization of digital subcarriers enables water-filling performance (Che & Shieh, 2018) and mitigates the impact of equalizer-enhanced phase noise (Sun et al., 2020).
Several researchers (Bobrov & Dordzhiev, 2023; Civelli & Secondini, 2020; Guiomar et al., 2020; Gültekin et al., 2019) have proposed diverse transmission systems based on probabilistic shaping, resulting in enhanced system performance and confirming the efficacy of probabilistic shaping technology across different communication systems. Probabilistic amplitude shaping (PAS) coded modulation, proposed as an integration strategy between modulation and channel coding (Böcherer et al., 2015), has garnered considerable interest in both wireless and optical communications. The architecture of probabilistic amplitude shaping involves CCDM (Constant Composition Distribution Matching), utilizing arithmetic coding as a shaping algorithm, with performance improvements observed for longer data packets (Schulte & Böcherer, 2015). However, this integration, while showcasing excellent performance at quasi-infinite DM word lengths, poses high complexity for hardware implementation due to the use of arithmetic coding, despite simplification from outer concatenation.
Various DM algorithms, such as enumerative sphere shaping (ESS) (Gültekin et al., 2018), multiset-partition DM (Fehenberger et al., 2018), and DM with shell mapping (Schulte & Steiner, 2019), have been explored. Look-up table (LUT) based DMs have been applied in fixed-length hierarchical DM (Yoshida, Karlsson, et al., 2019a), fixed-length optimum bit-level DM with binomial coefficients (Koganei et al., 2019), and variable-length prefix-free code DM (J. Cho & Winzer, 2019). For live traffic with limited source entropy, joint source-channel coding has been shown to simultaneously compress data and shape it to reduce average symbol energy, based on sorting of LUT contents in the DM with minimal added complexity (Yoshida, Karlsson, et al., 2019b). İşcan et al. (2019) utilized probabilistic shaping to enhance system performance in the 5G New Radio system by introducing a polar encoder after the shaping encoder, resulting in improvements of over 1 dB on AWGN channels without increased computing requirements.
Probabilistic amplitude shaping can be effectively implemented on a Field Programmable Gate Array (FPGA), offering versatility in integration into an Orthogonal Frequency Division Multiplexing-Multiple Input Multiple Output (OFDM-MIMO) system. FPGAs are widely adopted in communication systems for their enhanced processing speed, parallelism in implementation, and cost reduction (Ganesh et al., 2015). Spatial multiplexing MIMO-OFDM implementation using FPGA has become a key area of interest for researchers (Chen et al., 2007; Kerttula et al., 2007; Mahendra Babu et al., 2014; Park & Ogunfunmi, 2012; Yoshizawa & Miyanaga, 2012), with various designs showcasing advancements in different aspects of communication systems.
This paper proposes a modified method based on the multi-repeat distribution matcher (MRDM) principle (Jing et al., 2021; Yoshida et al., 2021; Yoshida, Binkai, et al., 2019) as a probabilistic shaping coder. The proposed method employs smaller lookup tables to create a coding scheme, significantly reducing required storage capacity without complex calculations, making it easily implementable on FPGA. The non-uniform distribution, akin to the Maxwell-Boltzmann (MB) distribution, is achieved by selecting the number of levels or subsets and the symbols for repetition in each level, creating small lookup tables with the same number of levels according to the shaping rate required for shaping encoder and decoder formation. Altering the distribution enhances energy and spectrum efficiency, as the average symbol energy required to transmit data with a non-uniform distribution is lower than that of the same data transmitted with a uniform distribution at the same noise level. Mathematical analysis of the bit error rate over AWGN and Massive-MIMO channels was performed, and simulations of MMRDM were conducted through a 2×2 MIMO-OFDM channel with two different equalizers using the Xilinx System Generator. Results indicate significant improvements in storage capacity, computational complexity, and bit error rate (BER) when utilizing MMRDM compared to uniform QAM under the same channel and signal-to-noise ratio conditions.

SYSTEM MODEL
As illustrated in Figure 1, this paper employs a 2×2 MIMO-OFDM communication system utilizing coded modulation based on rectangular QAM and MMRDM probabilistic shaping. On the transmitting end, the PAS encoder transforms the uniformly distributed bit stream into the desired distribution, which will be discussed in detail in the subsequent section. Following this, the convolutional FEC encoder processes the shaped bits, and QAM mappers then associate these bits with QAM symbols.
At the receiver, after the signal has traversed the 2×2 MIMO channel, two types of equalizers (zero-forcing and MMSE) are employed to reconstruct the original signal. The QAM demapper calculates log-likelihood ratios (LLRs) based on the observed signals. Subsequently, the FEC decoder receives the LLRs from the demapper to correct channel errors, and its output is mapped back to data bits by the shaping (invDM) decoder.
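The two linear equalizers mentioned above can be sketched for a single flat-fading 2×2 channel use. This is a minimal illustration and not the XSG design: the channel matrix, the transmitted symbols, and the function names `zf_equalize` and `mmse_equalize` are hypothetical, and perfect channel knowledge is assumed.

```python
import numpy as np

def zf_equalize(H, y):
    """Zero-forcing: apply the pseudo-inverse of the channel matrix."""
    return np.linalg.pinv(H) @ y

def mmse_equalize(H, y, noise_var, signal_var=1.0):
    """Linear MMSE: solve (H^H H + (noise_var / signal_var) I) x = H^H y."""
    n_tx = H.shape[1]
    gram = H.conj().T @ H + (noise_var / signal_var) * np.eye(n_tx)
    return np.linalg.solve(gram, H.conj().T @ y)

# Hypothetical flat-fading 2x2 channel and one pair of transmitted 16-QAM points.
H = np.array([[0.9 + 0.1j, 0.3 - 0.2j],
              [0.2 + 0.4j, 1.1 - 0.3j]])
x = np.array([1 + 1j, -3 + 1j])
y = H @ x  # noiseless reception, for illustration only

x_zf = zf_equalize(H, y)
x_mmse = mmse_equalize(H, y, noise_var=1e-9)
```

With negligible noise both estimates coincide with the transmitted pair. As the noise variance grows, the regularization term in the MMSE solution limits the noise enhancement that zero-forcing suffers on ill-conditioned channels.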

MODIFIED MULTI-REPEAT DISTRIBUTION MATCHER
As commonly understood, shaping coding operates on the principle of altering the distribution of data with a rate $R_b$ from uniform to non-uniform. The objective is to minimize the average energy of symbols in the constellation by repeating points closest to the center, with lower energy, more frequently than distant points with higher energy. This approach is employed to enhance the overall performance of the system (K. Cho & Yoon, 2002). To achieve optimal results, the probability of the constellation signal point $r \in R$ must adhere to Eq. (1):
$$P(r) = \frac{e^{-\lambda |r|^2}}{\sum_{r' \in R} e^{-\lambda |r'|^2}} \quad (1)$$
where $\lambda$ is a parameter governing the trade-off between average energy and bit rate. As mentioned earlier, this distribution is the Maxwell-Boltzmann (MB) distribution.
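To make the MB distribution concrete, the following sketch evaluates it on a square 16-QAM grid with unit amplitude step. The parameter name `lam`, its value 0.1, and the helper names are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def qam_points(M):
    """Square M-QAM grid with odd-integer coordinates (amplitude step A = 1)."""
    side = int(round(np.sqrt(M)))
    levels = np.arange(-(side - 1), side, 2)  # e.g. -3, -1, 1, 3 for 16-QAM
    return np.array([complex(i, q) for i in levels for q in levels])

def mb_distribution(points, lam):
    """Maxwell-Boltzmann probabilities: proportional to exp(-lam * |r|^2)."""
    weights = np.exp(-lam * np.abs(points) ** 2)
    return weights / weights.sum()

points = qam_points(16)
p_uniform = mb_distribution(points, 0.0)  # lam = 0 gives the uniform case
p_shaped = mb_distribution(points, 0.1)   # hypothetical shaping parameter

avg_energy_uniform = float(np.sum(p_uniform * np.abs(points) ** 2))
avg_energy_shaped = float(np.sum(p_shaped * np.abs(points) ** 2))
```

Setting the parameter to zero recovers uniform QAM, while larger values concentrate probability on the inner points and lower the average energy at the cost of entropy.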
In this proposed approach, the probabilistic shaping process involves the addition of extra bits to achieve a one-to-one mapping, resulting in reduced computational complexity for the system. The method utilizes multiple repetition mapping of combinations instead of sequence permutation to derive the mapping sequence in Probabilistic Amplitude Shaping (PAS). This technique is referred to as multi-repeat shaping (MRS) (Jing et al., 2021). The modification presented in this paper involves populating multiple lookup tables with constellation points based on the target rate as subsets, aiming to achieve a distribution similar to the MB distribution. Figure 2 provides an illustrative depiction of this method.
To begin, we define the number of levels or groups as $L$, with the assignment groups denoted as $g_1, \ldots, g_L$. Symbols, representing the constellation's points, consist of a group of bits according to the order of the constellation. For each level or subset, a lookup table is established. The output from these tables is the probabilistic encoding of the original input information for a given packet length (Jing et al., 2021). It is crucial to carefully define the rules for selecting constellation symbols to ensure the flexibility of the shaping method. In the case of an M-QAM constellation, where there are $M$ symbols, each symbol is composed of $N = \log_2(M)$ bits. Within each level, a certain number of these symbols will be repeated. For instance, in level $g_1$, the number of symbols is denoted as $c_1$. Subsequently, the $g_2$ subset is formed from $g_1$, incorporating its symbols ($c_2$ in total), and this process continues until reaching the final group $g_L$ with its designated number of symbols, $c_L$. The repetition of symbols follows the energy rule, meaning that symbols with lower energy undergo more repetition.
This approach achieves a one-to-one mapping of symbols across all subsets for a given block length of the original data. This mapping is designed to avoid the computational complexity associated with a many-to-one assignment resulting from a complex iterative method.
In this method, the main parameters that must be specified to perform the shaping encoding (Jing et al., 2021) are as follows:
$L$: the number of levels
$N$: the number of bits for each symbol within the $M$-point constellation
$c_1, c_2, \ldots, c_L$: the number of symbols in each of the groups
$m$: the input information's block length
These parameters are related to each other by the following relationships. Let the number of vectors satisfying the MB distribution $P_{A'}$ be given by Eq. (3). In that case, the input block length $m$ is
$$m = \lfloor \log_2 c_1 \rfloor + \cdots + \lfloor \log_2 c_L \rfloor \quad (4)$$
The shaping rate $R$ is
$$R = \frac{m}{L \times N} \quad (5)$$
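The block length of Eq. (4) and the resulting shaping rate can be checked with a few lines of code. The rate is taken here as m / (L × N), an assumption inferred from the rates of 5/8 and 3/4 quoted later in the paper for the 2-level and 3-level 16-QAM cases; the function names are illustrative.

```python
from fractions import Fraction

def block_length(c):
    """Eq. (4): m = sum over levels of floor(log2(c_i)), for integer c_i >= 1."""
    return sum(ci.bit_length() - 1 for ci in c)

def shaping_rate(c, N):
    """Assumed rate: m input bits are carried by L symbols of N bits each."""
    return Fraction(block_length(c), len(c) * N)

m_two_level = block_length([4, 8])         # 16-QAM, L = 2
r_two_level = shaping_rate([4, 8], N=4)
m_three_level = block_length([4, 8, 16])   # 16-QAM, L = 3
r_three_level = shaping_rate([4, 8, 16], N=4)
m_eight_level = block_length([4, 4, 8, 8, 16, 32, 32, 64])  # 64-QAM, L = 8
```

Using `int.bit_length` gives an exact integer floor of the base-2 logarithm, which avoids floating-point rounding for large subset sizes.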
The codebook size of MRDM is $2^m$ codewords. The parameter $L$ plays a crucial role in determining the trade-off between the divergence of the distribution from the MB distribution and the complexity of the shaping encoder. The maximum number of combined codewords is governed by $L$, and each symbol comprises $N$ bits. Consequently, the storage capacity needed for the generated lookup table is $2^m \times (L \times N)$ bits. This implies that an increase in the number of levels and the modulation order results in heightened computational complexity and a significant increase in storage capacity. For instance, in the scenario where $L = 2$ and the modulation is 16-QAM (meaning $N = 4$) with levels $c_1 = 4$ and $c_2 = 8$, based on Eq. (4), the block length $m$ is 5. Therefore, the size of the lookup table will be $(2^5 \times 8)$ bits.
This implies that the storage capacity required would be minimal, but there would be a greater distribution divergence. However, by setting $L = 3$ for the same modulation and selecting levels with $c_1 = 4$, $c_2 = 8$, and $c_3 = 16$, based on Eq. (4), the block length is $m = 9$. Consequently, the lookup table would be $(2^9 \times 12)$ bits, indicating an increase in storage capacity and a decrease in distribution divergence. Nevertheless, as the modulation order rises, the necessary capacity will surge significantly. To illustrate, consider 64-QAM with ten different energy levels and 64 symbols, where $N = 6$. Assuming $L = 8$ and levels $c_1 = 4$, $c_2 = 4$, $c_3 = 8$, $c_4 = 8$, $c_5 = 16$, $c_6 = 32$, $c_7 = 32$, $c_8 = 64$, Eq. (4) gives $m = 30$ and a lookup table of size $(2^{30} \times 48)$ bits, which is impractical due to its complexity.
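The growth of the single-table storage described above is easy to reproduce. The sketch below evaluates 2^m × (L × N) for the three configurations discussed in the text; the function name is illustrative.

```python
def single_lut_bits(c, N):
    """Size in bits of one monolithic MRDM table: 2^m entries of L*N bits."""
    m = sum(ci.bit_length() - 1 for ci in c)  # Eq. (4)
    return (2 ** m) * (len(c) * N)

lut_16qam_l2 = single_lut_bits([4, 8], N=4)
lut_16qam_l3 = single_lut_bits([4, 8, 16], N=4)
lut_64qam_l8 = single_lut_bits([4, 4, 8, 8, 16, 32, 32, 64], N=6)

# Convert the 64-QAM case to gibibytes for a sense of scale.
lut_64qam_l8_gib = lut_64qam_l8 / 8 / 2 ** 30
```

The 8-level 64-QAM table works out to 6 GiB, which makes clear why a single lookup table is not a realistic option on an FPGA.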
To address this, we propose a modification to make the lookup table practical and easy to calculate. The lookup table is divided into smaller tables, each containing symbols equal to the number of symbols in each group. Returning to the previous examples, when $L = 2$, the size of the lookup table is 256 bits (Jing et al., 2021), Table I. With the modification, two lookup tables are created: the first containing 4 symbols, each represented by 4 bits, and the second containing 8 symbols, each represented by 4 bits. This reduces the storage requirement to only 3 bits for the encoding tables, without the need for any table during the decoding process, as depicted in Figure 3. At $L = 3$, the single lookup table requires 6144 bits to store, but only 3 bits are needed when using multiple lookup tables for coding, as illustrated in Figure 4. For $L = 8$ and 64-QAM modulation, the single lookup table needed would be $(2^{30} \times 48)$ bits, which is impractical. However, with our modification, only 18 bits are required for the shaping and de-shaping process, as shown in Figure 5.
From the MMRDM, the probabilities of the generated distribution can be deduced as follows. QAM symbol probabilities are denoted by $P(a_i) = [p_1, p_2, \ldots, p_L]$, with the $L$ groups having varying probabilities. Group $g_1$ comprises $c_1$ symbols with equal probabilities.
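A software sketch of the per-level mapping may help to see why no large table is needed when every subset size is a power of 2. The labeling used here is an assumption made for illustration: the k-th symbol of a level is given the binary label k, so the encoder only pads each chunk of input bits with leading zeros and the decoder only strips them. The actual symbol labeling in Figures 3 to 5 may differ.

```python
from itertools import product

def encode(bits, c, N):
    """Map m input bits to L symbol labels, each an N-bit string."""
    labels, pos = [], 0
    for ci in c:
        k = ci.bit_length() - 1                 # bits consumed by this level
        chunk = bits[pos:pos + k]
        index = int("".join(map(str, chunk)), 2)
        labels.append(format(index, f"0{N}b"))  # pad with N - k extra zero bits
        pos += k
    if pos != len(bits):
        raise ValueError("input length must equal the block length m")
    return labels

def decode(labels, c):
    """Recover the input bits by dropping the padding bits of each label."""
    bits = []
    for label, ci in zip(labels, c):
        k = ci.bit_length() - 1
        bits.extend(int(b) for b in label[-k:])
    return bits

c, N = [4, 8], 4                                # 16-QAM, L = 2, so m = 5
example_labels = encode([1, 0, 1, 1, 0], c, N)
round_trip_ok = all(
    decode(encode(list(b), c, N), c) == list(b)
    for b in product([0, 1], repeat=5)
)
```

Under this assumed labeling the mapping is one-to-one by construction, and the number of padding bits per block is N × L − m, which equals 3 for this configuration.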

MMRDM PERFORMANCE
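The QAM symbol error rate $P_s$ through the AWGN channel can be expressed in terms of the minimum distance $d_{min}$ as in (K. Cho & Yoon, 2002), referred to here as Eq. (6). Because shaping lowers the average symbol energy, $E_{MMRDM} < E_{QAM}$. The minimum distance is $d_{min} = 2A$, and for uniform QAM $A = \sqrt{3E_s / (2(M-1))}$, so that $d_{min} = \sqrt{6E_s / (M-1)}$. At the same average symbol energy, $d_{min,MMRDM} > d_{min,QAM}$. Consider 16-QAM as a case in point.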
Substituting these values in Eq. (6) gives the symbol error rate of uniform 16-QAM. For 16-QAM with MMRDM and $L = 2$, $E_{MMRDM} = 4A^2$, so the amplitude is $A = \sqrt{E_s / 4}$ and $d_{min} = \sqrt{E_s}$, which is likewise substituted in Eq. (6). The bit error rate is $P_b = P_s / \log_2 M$. Upon comparing Eqs. (8) and (9), it becomes evident that the symbol error rate of MMRDM QAM is lower than that of standard QAM. This improvement is attributed to the shaping process, wherein the average symbol energy of the constellation decreases, resulting in an increased minimum distance ($d_{min}$) between the symbols compared to uniform QAM (Yu et al., 2020).
The performance of the proposed method can be assessed by calculating the achievable shaping gain ratio $G_{MMRDM}$. This ratio is defined as the ratio between the shaping gain provided by PS-MMRDM using a QAM lattice of size $M_{MMRDM}$ and the gain provided by the uniform QAM constellation of size $M_{QAM}$, as given by Eq. (10). The gain ratio is determined to be approximately 3.9 dB by substituting $E_{QAM} = 10A^2$ and $E_{MMRDM} = 4A^2$.
To compute the storage capacity required for the proposed method, two scenarios are considered. First, if the selected subsets are powers of 2, the storage complexity is $(N \times L - m)$ bits. However, if any subset $c_i$ is not a power of 2, the lookup table requires additional storage beyond $(N \times L - m)$ bits.
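The 16-QAM figures used in this section can be checked numerically. The sketch below assumes nested subsets (the 4 innermost points for the first level, the 8 lowest-energy points for the second), uniform input bits, and A = 1. It also evaluates the standard exact SER expression for square QAM over AWGN with the rescaled minimum distance, which may differ in form from the paper's Eq. (6) and is only an approximation for a shaped constellation. All names are illustrative.

```python
from math import erfc, log10, sqrt

def level_probs(level_sets):
    """Each block carries one symbol per level, drawn uniformly from that level's subset."""
    L = len(level_sets)
    probs = {}
    for subset in level_sets:
        for s in subset:
            probs[s] = probs.get(s, 0.0) + 1.0 / (L * len(subset))
    return probs

def qam_ser(M, es_over_n0, energy_over_a2):
    """Square-QAM SER with d_min = 2A and A^2 = Es / energy_over_a2."""
    q = 0.5 * erfc(sqrt(es_over_n0 / energy_over_a2))  # Q(d_min / sqrt(2 N0))
    p = 2.0 * (1.0 - 1.0 / sqrt(M)) * q                # per-dimension error rate
    return 2.0 * p - p * p

grid = (-3, -1, 1, 3)
points = sorted((complex(i, q) for i in grid for q in grid), key=abs)
g1, g2 = points[:4], points[:8]            # assumed nested subsets, c1 = 4, c2 = 8

probs = level_probs([g1, g2])
e_mmrdm = sum(p * abs(s) ** 2 for s, p in probs.items())      # in units of A^2
e_qam = sum(abs(s) ** 2 for s in points) / len(points)
gain_db = 10 * log10(e_qam / e_mmrdm)

es_over_n0 = 10 ** (15 / 10)               # hypothetical operating point: 15 dB
ser_uniform = qam_ser(16, es_over_n0, e_qam)
ser_mmrdm = qam_ser(16, es_over_n0, e_mmrdm)
```

Under these assumptions the average energies come out as 10A² for uniform 16-QAM and 4A² for the 2-level MMRDM, and the energy ratio is about 3.98 dB, in line with the roughly 3.9 dB quoted in the text.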
For ESS, the storage required depends on the number of energy levels $L$ that satisfy the shaping rate $R_s$, which can be relatively large; the corresponding expression is given in (Gültekin et al., 2018).
Next is the calculation of computational complexity, representing the arithmetic operations needed for encoding and decoding in the shaping methods. For the proposed method, two cases are considered. In the first case, if the levels or subgroups are powers of 2, no calculation process is needed for the sender or receiver, as depicted in Figures 3 and 4. However, if the level is not a power of 2, simple operations are required to find the address of the symbol in the lookup table, whether for the transmitter or the receiver.
The time complexity for this subset is O(1). As for ESS, the number of bit operations required per symbol, whether for the transmitter or the receiver, depends on the cardinality of the symbol alphabet (Gültekin et al., 2018).
In conclusion, if the subgroups in the proposed method are selected as powers of 2, the storage and computational complexity will be significantly lower compared to ESS. For instance, if the number of output symbols is 30, the modulation scheme used is 64-QAM, and $m = 120$, then the memory required for MMRDM is only 60 bits.
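The storage figures quoted for the power-of-2 case are all consistent with counting the padding bits per block, that is, N times the number of output symbols minus m. The sketch below uses that expression as an assumption inferred from the worked numbers in the text (3, 3, 18, and 60 bits); the function name is illustrative.

```python
def mmrdm_storage_bits(N, num_symbols, m):
    """Assumed power-of-2 case: output bits per block minus input bits per block."""
    return N * num_symbols - m

storage_16qam_l2 = mmrdm_storage_bits(N=4, num_symbols=2, m=5)
storage_16qam_l3 = mmrdm_storage_bits(N=4, num_symbols=3, m=9)
storage_64qam_l8 = mmrdm_storage_bits(N=6, num_symbols=8, m=30)
storage_64qam_30 = mmrdm_storage_bits(N=6, num_symbols=30, m=120)
```

Unlike the single-table size, this quantity grows only linearly with the number of output symbols, which is why the memory demand stays small relative to the block length.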

XILINX SYSTEM GENERATOR IMPLEMENTATION OF A MIMO-OFDM
Figure 6 illustrates the block diagram of the key functions implemented on the FPGA (Xilinx® Kintex®-7 xc7k325t-3fbg676). Each block is constructed using Xilinx System Generator (XSG) tools with a master clock period of 5 ns. The parameters employed for the MIMO-OFDM system are summarized in Table 1.
On the transmitter side, the output of the random bit generator, operating at a rate of 200 Mbps, is scrambled using a chaos-based stream ciphering method. The Modified Multi-Repeat Distribution Matcher (MMRDM) alters the data distribution by adding extra bits to the original bits, enabling the use of multiple lookup tables. MMRDM rates depend on the number of levels and the cardinality of each level according to Eq. (5). For instance, the rate of a 2-level MMRDM with 16-QAM is 5/8, and for a 3-level MMRDM it is 3/4. Figures 7 and 8 depict the XSG block diagrams of the 2-level and 3-level MMRDM encoders and decoders. The convolutional encoder encodes the scrambled binary bit stream with a code rate of 1/2, resulting in an output rate double that of the MMRDM output. A matrix interleaver is then applied to produce the interleaved bit stream. QAM modulation is utilized, generating complex modulated signals with a symbol rate equal to the convolutional encoder output rate divided by $N$ (the number of bits per symbol). The signal is then converted from serial to 2 parallel samples to achieve a rate twice that of the QAM symbol rate. Subsequently, the signal passes through the Inverse Fast Fourier Transform (IFFT) block to produce the OFDM signal, with the cyclic prefix guard omitted. One signal is transmitted through one antenna, and the other through the second antenna, effectively doubling the system's capacity. Following this, the signals pass through a flat-fading MIMO channel with AWGN.
On the receiving side, linear MMSE MIMO detection is employed to recover the clean signals. It is assumed that the channel estimation is perfect, and the channel parameters are fed directly into the MMSE MIMO detection. All operations performed on the transmitting side are reversed on the receiving side to reconstruct the binary bit stream.
Returning to the probabilistic shaping encoder, the method's efficiency has been theoretically demonstrated in terms of the required storage capacity and computational complexity. This efficiency is further validated when the encoder is implemented in the OFDM-MIMO system on an FPGA. For instance, when generating VHDL code or a bitstream file for the various systems, the resources needed for uniform 16-QAM modulation are 0.67% BRAM, 67.12% LUT, 80.95% DSP48s, and 1.48% registers. In comparison, the required resources for L2-MMRDM 16-QAM are 0.67% BRAM, 67.15% LUT, 80.95% DSP48s, and 1.5% registers. The selected device shows only a marginal increase of 0.037% in LUT and 0.02% in registers over the uniform 16-QAM resources, with no increase in DSP48s, as demonstrated in Table 2. This table reports the resource utilization on the Xilinx® Kintex®-7 xc7k325t-3fbg676. It is evident that the MMRDM technique consumes minimal resources, and both the MMRDM encoder and decoder require no BRAM or DSP slices.

RESULTS AND DISCUSSION
In the proposed system, the base station employs OFDM-MIMO and is implemented using XSG in MATLAB R2022b. System performance is assessed by focusing on parameters such as bit error rate (BER) and modulation gain ratio, which are crucial for evaluating communication system reliability. In the simulation of the MMRDM-OFDM-MIMO system, a data block size ranging between 22,464,000 and 29,952,000 bits is selected. Figure 9 depicts the BER as a function of signal-to-noise ratio (SNR) for 16-QAM and 64-QAM, considering a 2×2 MIMO system with two transmitting and two receiving antennas.
In Figure 9(a), where the number of MMRDM groups is L = 2 and L = 3, the performance is compared to uniform 16-QAM. In Figure 9(b), the number of shaping groups is L = 8 and L = 10, and they are compared to uniform 64-QAM, in both cases for the zero-forcing equalizer. Noticeable improvements in BER are observed with MMRDM compared to uniform QAM, with an improvement of about 2-4 dB at a BER of 10^(-4). Figure 10 shows the results for the MMSE equalizer with the same probabilistic shaping combinations and FEC rate, demonstrating that MMRDM probabilistic modulation consistently provides better performance than uniform QAM in both systems, with an improvement of about 2-2.5 dB at a BER of 10^(-4). As evident in Figures 9 and 10, the number of levels (L) plays a crucial role in balancing the deviation of the input data distribution from the MB distribution, the shaping rate, and the associated complexities in terms of computational demands and required storage capacity. With an increase in L, the rate rises and rate losses decrease, while the required storage capacity increases but remains low in MMRDM, as highlighted in Table 2.

CONCLUSIONS
This study explores the application of the MMRDM-based probabilistic shaping technique in a 2×2 MMSE MIMO-OFDM communication system implemented on an FPGA using the Xilinx System Generator. Our findings demonstrate enhanced system performance and a simplified shaping process that imposes neither significant computational complexity nor extensive storage requirements. Notably, the memory demand is minimal compared to the data block length. Furthermore, the shaping gain ratio of Eq. (10) indicates substantial improvements in energy efficiency, coupled with a noteworthy reduction in bit error rate (BER). Both the results and the mathematical analyses underscore the favorable shaping advantages conferred by the MMRDM method.
In the FPGA implementation, it is noteworthy that, as detailed in Table 2, the MMRDM encoder and decoder exhibit efficiency, requiring neither BRAM nor DSP slices, and demonstrating low utilization of Look-Up Tables (LUT) and registers.

In Eq. (6), $d_{min}$ denotes the minimum distance between constellation symbols. It depends on the number of symbols $M$, on the bit energy $E_b$, and on $N = \log_2 M$, the number of bits for each symbol. The symbol locations are $(\pm A, \pm 3A, \ldots, \pm(\sqrt{M} - 1)A) + i(\pm A, \pm 3A, \ldots, \pm(\sqrt{M} - 1)A)$ for the square QAM constellation. The average symbol energy is $E_{QAM} = \frac{2}{3}(M - 1)A^2$ for QAM and $E_{MMRDM} = \sum_{i=1}^{M} P(a_i)\,|a_i|^2$ for MMRDM QAM.