# Read Side PKT\_RX

The Read Side of PKT\_RX reads data from DDR3 memory, reformats it and streams it to the filterbank module.

## Architecture



Read Side block diagram

## MAIN\_CTRL module

The main controller performs the following tasks:

1. Keeps track of the requested number of integration batches.
2. Controls the integration and data streamout procedure.
3. Calculates the start addresses for every integration.
4. Controls communication with the delay module.
5. Provides control signals to filterbank and framer.

These tasks are explained in detail below.

### Keeping track of the requested number of integration batches

The batch\_size value, written through the MM-interface, indicates how many integrations are requested. This number is stored in a FIFO. The FIFO has an 'empty' flag, which is asserted initially and is deasserted as soon as 2 batch\_size values have been written. Deasserting 'empty' asserts the signal 'this\_fn\_ready', which indicates that data processing can start. The batch\_size value that is read out of the FIFO is called 'numintegrations'.
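The behaviour of the flag can be illustrated with a small behavioural model in Python. This is only a sketch of the description above, not the VHDL implementation: the class name `BatchSizeFifo` and the `threshold` parameter are hypothetical, and what happens to 'empty' after values are read out is not described here, so it is not modelled.

```python
from collections import deque


class BatchSizeFifo:
    """Behavioural sketch of the batch_size FIFO (hypothetical model)."""

    def __init__(self, threshold=2):
        self._queue = deque()
        self._threshold = threshold  # number of writes before 'empty' drops
        self.empty = True            # asserted initially

    def write(self, batch_size):
        """Store one batch_size value written through the MM-interface."""
        self._queue.append(batch_size)
        if len(self._queue) >= self._threshold:
            self.empty = False

    def read(self):
        """Return the oldest value, called 'numintegrations' in MAIN_CTRL.

        Re-assertion of 'empty' after reads is not modelled (assumption).
        """
        return self._queue.popleft()

    @property
    def this_fn_ready(self):
        # 'this_fn_ready' is asserted once 'empty' is deasserted.
        return not self.empty


fifo = BatchSizeFifo()
fifo.write(3)
ready_after_one_write = fifo.this_fn_ready
fifo.write(5)
ready_after_two_writes = fifo.this_fn_ready
```

In this model, `ready_after_one_write` is false and `ready_after_two_writes` is true, matching the two-write condition described above.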

### Controlling the integration and data streamout procedure

When an active high 'all\_fn\_ready' signal is received, the main state machine is started. It reads a value from the batch\_size FIFO, decrements the 'integration\_count' and starts the process of streaming out data from the DDR3 memory by means of the assertion of the 'start\_data\_stream' signal. The state machine then transitions from IDLE to INTEGRATE.

In the INTEGRATE state, it checks for the 'end\_cycle' signal to be asserted, indicating that the integration of a station pair is finished. If not all 4 station pairs have yet been processed, it starts the process of streaming out data from DDR3 again and waits for a new 'end\_cycle' signal. When the 'end\_cycle' signal is received for the last station pair, it checks whether the 'integration\_count' is 0. If not, then it calculates the new start address for reading the DDR3 and starts streaming out data from the DDR3 again. If 'integration\_count' is 0, then the integration process is terminated.
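The sequence of streamouts started by the main state machine can be sketched as a small Python model. It is a simplified reading of the two paragraphs above, not the VHDL: the function name `run_main_ctrl` is hypothetical, each loop iteration stands for one 'end\_cycle' pulse, and decrementing 'integration\_count' at the start of every new integration is an assumption (the text only mentions the decrement when leaving IDLE).

```python
def run_main_ctrl(numintegrations, station_pairs=4):
    """Return the (integration, station_pair) streamouts that are started."""
    if numintegrations < 1:
        raise ValueError("numintegrations must be at least 1")

    started = []
    # IDLE: read the FIFO value, decrement the count, assert start_data_stream.
    integration_count = numintegrations - 1
    integration, pair = 0, 0
    started.append((integration, pair))

    # INTEGRATE: every iteration models the arrival of one 'end_cycle'.
    while True:
        if pair < station_pairs - 1:
            pair += 1                # next station pair, same integration
        elif integration_count != 0:
            integration_count -= 1   # assumption: decrement per integration
            integration += 1         # new start address would be computed here
            pair = 0
        else:
            break                    # last pair done and count is 0: terminate
        started.append((integration, pair))
    return started


streams = run_main_ctrl(3)
```

Under these assumptions, a batch of 3 integrations starts 12 streamouts, 4 station pairs for each integration.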

The data streamout process referred to above is controlled by the streamout\_ctrl state machine, which is triggered by the main\_ctrl state machine. When the streamout\_ctrl state machine receives a 'start\_data\_stream' signal, it requests the delay module to send delay data for the appropriate station pair. When these delay figures are received, a down counter is loaded with a predefined value (1023) to allow the streamout FIFOs to be filled with DDR3 data. During countdown, the counter value is compared with the 7 LSBs of the coarse delay values from the delay module. These 7 LSBs represent the sample offset. As soon as the counter value equals the sample offset value, the data starts streaming out of the streamout FIFOs, which is triggered by 'enable\_streamout\_stX'. This procedure is done separately for both stations of the station pair, since they will have different delay figures. The purpose of this procedure is to align the samples before they are applied to the filterbank.
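The alignment step can be illustrated with a short Python sketch. It assumes the counter holds 1023 in cycle 0 and decrements once per clock; the function name `streamout_enable_cycles` is hypothetical and the model ignores any pipeline latency in the real design.

```python
def streamout_enable_cycles(delay_coarse_st0, delay_coarse_st1, preload=1023):
    """Return, per station, the cycle at which enable_streamout_stX asserts."""
    # The 7 LSBs of each coarse delay are the sample offset.
    offsets = [delay & 0x7F for delay in (delay_coarse_st0, delay_coarse_st1)]
    enable_cycle = [None, None]

    counter = preload
    cycle = 0
    while counter >= 0:
        for station, offset in enumerate(offsets):
            if counter == offset and enable_cycle[station] is None:
                enable_cycle[station] = cycle
        counter -= 1
        cycle += 1
    return enable_cycle


# Coarse delays 5 and 130 have sample offsets 5 and 2 respectively.
cycles = streamout_enable_cycles(5, 130)
```

In this model the station with the larger sample offset starts streaming earlier, so the two streams end up shifted by the difference of their sample offsets.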

After sample alignment has been done, as much data is streamed out to the filterbank as requested through the number of FFTs per integration.

### Calculating the start addresses for every integration

The addresses per station from which the DDR3 is read are called 'start\_address\_st0' and 'start\_address\_st1'. 'start\_address\_stX' is in units of 256-bit columns. It is the sum of two components: a base address, which is incremented according to the number of FFTs processed, and the delay figure, truncated to a multiple of 128 samples (one 256-bit column at 2 bits per sample). The figure below explains this.

| 27:7             | 6:0           |
|------------------|---------------|
| delay_coarse_stX | sample offset |

Coarse delay

The calculation of 'base\_address' and 'start\_address\_stX' for one station is explained here.

Calculating 'base\_address':

- To read data for 1 FFT, 16 words of 256 bits must be read from the DDR3 memory.
- Integration time, 'tintegrate\_in\_ffts', is expressed in number of FFTs.
- So 'base\_address' := 'base\_address' + ('tintegrate\_in\_ffts' x 16)
- The updated 'base\_address' might exceed the buffer size, in which case 'base\_address' must be corrected.
- IF 'base\_address' > 'base\_address\_limit' - 1 THEN 'base\_address' := 'base\_address' - 'base\_address\_limit' END IF

Calculating 'start\_address\_stX':

- IF 'delay\_coarse\_stX' < 0 THEN 'delay\_coarse\_stX' := 'delay\_coarse\_stX' + 'base\_address\_limit' END IF
- 'delay\_coarse\_stX' := 'delay\_coarse\_stX' + 'base\_address'
- IF 'delay\_coarse\_stX' > 'base\_address\_limit' - 1 THEN 'delay\_coarse\_stX' := 'delay\_coarse\_stX' - 'base\_address\_limit' END IF
- 'start\_address\_stX' := 'delay\_coarse\_stX'

So 'start\_address\_stX' is basically 'delay\_coarse\_stX' + 'base\_address', wrapped around at 'base\_address\_limit'.
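The two calculations above can be written out as a minimal Python sketch. It assumes that the coarse delay passed in is already expressed in 256-bit columns (sample offset bits removed) and that each update overflows the buffer by less than one full 'base\_address\_limit', since the pseudocode corrects only once. The function names are hypothetical.

```python
def next_base_address(base_address, tintegrate_in_ffts, base_address_limit,
                      words_per_fft=16):
    """Advance base_address by one integration and wrap at the buffer end."""
    base_address += tintegrate_in_ffts * words_per_fft
    if base_address > base_address_limit - 1:
        base_address -= base_address_limit
    return base_address


def start_address(delay_coarse, base_address, base_address_limit):
    """Combine the coarse delay (in columns) with the base address."""
    if delay_coarse < 0:
        delay_coarse += base_address_limit   # negative delay wraps backwards
    delay_coarse += base_address
    if delay_coarse > base_address_limit - 1:
        delay_coarse -= base_address_limit   # wrap past the end of the buffer
    return delay_coarse


LIMIT = 10**6  # example value for base_address_limit
wrapped_base = next_base_address(999_990, 1, LIMIT)
negative_delay_start = start_address(-3, 1, LIMIT)
wrapped_start = start_address(10, 999_995, LIMIT)
```

With a limit of 10^6 columns, a base address of 999990 advanced by one FFT (16 words) wraps to 6, a delay of -3 columns on base address 1 gives 999998, and a delay of 10 columns on base address 999995 wraps to 5.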

The value for 'base\_address\_limit' depends on the bandwidth of a subband and the number of bits per sample. In this case, the values are 16 MHz and 2 bits per sample (bps) respectively. With the 4 s buffer length used in the expression below, this results in:

$$\frac{16 \text{ MHz} \times 2 \text{ (Nyquist)} \times 2 \text{ bps} \times 4 \text{ s}}{256 \text{ (word size)}} = 10^6$$

### Controlling communication with the delay module

The signals between the MAIN\_CTRL module and the delay module are generated (outputs) and evaluated (inputs) in several places inside MAIN\_CTRL. (It is set up such that the earlier established interface between fn\_delay\_module and pkt\_rx is kept the same). Detailed information can be found in the VHDL code.

### Providing control signals to filterbank and framer

The control signals to the filterbank are 'out\_sync' and 'out\_valid'. They follow the same rules as sync and valid signals in other places in the design. The exact implementation can be found in the VHDL code.

The data from the filterbank to the framer is accompanied by control signals that are generated in PKT\_RX. These signals are 'start\_of\_fftperiod', 'ppf\_data\_valid' and 'fft\_in\_integration'. They arrive in time with the data at the framer, taking into account the latency through the filterbank. The exact implementation can be found in the VHDL code.

## SCHEDULER module

The SCHEDULER contains the READOP module which controls the actual reading from DDR3 memory. When it receives the 'start\_data\_readout' signal from the MAIN\_CTRL module, it copies the current 'start\_address\_stX' values and requests read access to the DDR3 controller. When access is granted, it does 8 read cycles of 256 bits (4 subbands, 2 polarizations) for both stations. The data read from DDR3 is stored in the streamout FIFOs (explained in the next section). As long as 'start\_data\_readout' is active and the streamout FIFOs are less than half full, it keeps reading from DDR3.

When either one of the streamout FIFOs is half full, filling that FIFO is suspended until its half-full flag is deasserted again. If 'start\_data\_readout' is low, indicating that all data for a particular integration has been read from DDR3, the state machine in READOP transitions back to IDLE.

The read address that is used to read from DDR3 is built as follows:

| 26:25        | 24            | 23:22      | 21        | 20:0         |
|--------------|---------------|------------|-----------|--------------|
| source_group | station_count | band_count | pol_count | column_count |

Where: source\_group is an input to READOP; station\_count, band\_count and pol\_count are generated within READOP; and column\_count is an internal copy of 'start\_address\_stX' that gets updated during the process.
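The field layout in the table can be expressed as a pair of pack and unpack helpers. This Python sketch only mirrors the bit positions given above; the function names are hypothetical and the range checks are an addition for illustration.

```python
# (name, least significant bit, width) for each field of the read address.
FIELDS = (
    ("source_group", 25, 2),
    ("station_count", 24, 1),
    ("band_count", 22, 2),
    ("pol_count", 21, 1),
    ("column_count", 0, 21),
)


def pack_read_address(**values):
    """Build the 27-bit DDR3 read address from its five fields."""
    address = 0
    for name, lsb, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        address |= value << lsb
    return address


def unpack_read_address(address):
    """Split a 27-bit read address back into its fields."""
    return {name: (address >> lsb) & ((1 << width) - 1)
            for name, lsb, width in FIELDS}


example = pack_read_address(source_group=1, station_count=1, band_count=2,
                            pol_count=0, column_count=5)
fields = unpack_read_address(example)
```

Unpacking a packed address returns the original field values, which makes the helpers convenient for checking addresses seen in simulation.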

## STREAMOUT\_FIFO module

When the FIFO in this module has been given some time to fill up (1023 clock cycles) and the down counter described in the MAIN\_CTRL module section reaches the value of 'delay\_coarse\_stX(6:0)', the state machine in the STREAMOUT\_FIFO module is started. This is done through 'enable\_streamout' being asserted.

The state machine transitions from IDLE to READ\_FIFO, in which state 8 read cycles of 256 bits each are done. The order of the data that is read is: band\_A\_pol\_0, band\_B\_pol\_0, band\_C\_pol\_0, band\_D\_pol\_0, band\_A\_pol\_1, band\_B\_pol\_1, band\_C\_pol\_1, band\_D\_pol\_1. This data is stored in registers.

Then the state machine transitions from READ\_FIFO to STREAMOUT. A counter is started that counts off 256 clocks. During these clocks, data is streamed out from the registers. On even count values the band\_A and band\_B registers are streamed out, 2 bits at a time (being 1 sample), and on odd count values the band\_C and band\_D registers are streamed out, 2 bits at a time (being 1 sample). Each 256-bit register holds 128 samples, so each register needs 128 clock cycles to be streamed out completely, and the two interleaved register pairs together fill the 256-clock count. The 2-bit data is mapped according to an encoding scheme that ensures maximum distance between the 4 possible values of the samples.
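The even/odd interleaving can be illustrated with a Python sketch for a single polarization. Several details are assumptions because they are not given here: samples are taken LSB first from each register, the raw 2-bit values are returned without the encoding scheme, and the function name `streamout_schedule` is hypothetical.

```python
def streamout_schedule(registers, sample_bits=2, clocks=256):
    """List, per clock, the (band, raw sample) pairs that are streamed out.

    `registers` maps 'A', 'B', 'C', 'D' to the 256-bit register contents
    of one polarization, given as Python integers.
    """
    mask = (1 << sample_bits) - 1
    schedule = []
    for count in range(clocks):
        # Even counts serve bands A and B, odd counts serve bands C and D.
        bands = ("A", "B") if count % 2 == 0 else ("C", "D")
        sample_index = count // 2
        shift = sample_index * sample_bits  # assumption: LSB-first order
        schedule.append(tuple((band, (registers[band] >> shift) & mask)
                              for band in bands))
    return schedule


regs = {"A": 0b1110, "B": 0, "C": 0b01, "D": 0b11}
sched = streamout_schedule(regs)
```

Each band appears on 128 of the 256 clocks, which matches 128 samples of 2 bits per 256-bit register.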

At the end of the streamout process, the 'enable\_streamout' signal is checked. If it is still asserted, new data is read from the FIFO; otherwise the process is terminated and the state machine transitions back to IDLE.