Skip to content

Latest commit

 

History

History
399 lines (350 loc) · 15.4 KB

0182-2023-12-03.md

File metadata and controls

399 lines (350 loc) · 15.4 KB

3 Dec 2023

Previous journal: Next journal:
0181-2023-12-02.md 0183-2023-12-04.md

GFMPW-1: Trying to fix UPW harden for gf180-rbz-fsm

  • Looked at Tholin's and ReJ's projects and they just use the generic SDC file; no changes at all.
  • Some of their designs have clock periods on the order of 6ns! Tholin's SID is 166.6ns (6MHz), tho.
  • make user_project_wrapper on this repo yields only fanout errors.
  • Added shortcut:
    # Shortcut for summary.py.
    # Usage: olsum SUMMARY_TYPE [DESIGN [EXTRA_ARGS]]
    # SUMMARY_TYPE is e.g. full-summary.
    # DESIGN defaults to user_project_wrapper.
    # EXTRA_ARGS is appended if supplied after DESIGN.
    function olsum() {
        stype=$1; shift
        design=${1:-user_project_wrapper}; shift
        echo summary.py --caravel --design "$design" --$stype $@
        summary.py --caravel --design "$design" --$stype $@
    }

Test 1

  • I'm doing a test without and with the generic SDC file. This is what I did:
    • Used 25ns CLOCK_PERIOD in UPW.
    • Kept
    • Moved all old run data (Legion 5 laptop) to openlane/ARCHIVED
    • Made top_ew_algofoogle (TEA) use wb_clk_i instead of io_in[11].
    • Loaded env: env-gf180
    • Started time make top_ew_algofoogle using 3 configured cores.
  • Next I did this:
    • Modify UPW config.json to use correct CLOCK_PORT for wb_clk_i.
    • make user_project_wrapper
    • Copy the symlinks of these runs and name them test-1, to refer back to this section.

Results for TEA:

  • Time: 12:25
  • GDS size: 26MB (26733240)
  • Details from olsum full-summary top_ew_algofoogle --run:
    • CoreArea_um^2: 910,571.1552
    • Util: 36.19%
    • Antennas: 1 pin, 1 net: Worst is only 102% (see reports/signoff/46-antenna_violators.rpt).
  • reports/signoff/37-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fanout: 1452 vios, all either 6/4 or 8/4
    • Fastest max: 5.19
    • Fastest slew: 0
    • Fastest cap: 0
    • Slowest max: -20.42
    • Slowest slew: 12 vios, worst 3.52/3.0
    • Slowest cap: 0
    • Typical max: 0.5
    • Typical slew: 0
    • Typical cap: 0

Results for UPW:

  • Time: 3:18
  • GDS size: 28MB (28422028)
  • Details from olsum full-summary user_project_wrapper --run:
    • WNS: -9.79
    • TNS: -239.29
    • CoreArea_um^2: 8,724,457.9968 (always expected to be the same)
    • Antennas: 30 pin, 30 net: Worst is 352%. 5 exceed 210% (see reports/signoff/30-antenna_violators.rpt).
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fastest max: 3.46
    • Fastest slew: 10, worst is 7.01/2.60
    • Fastest cap: 5, worst is 0.66/0.23
    • Slowest max: -39.53
    • Slowest slew: 43, worst is 17.99/3.0
    • Slowest cap: 5, worst is 0.66/0.25
    • Typical max: -9.64
    • Typical slew: 15, worst is 10.57/3.0
    • Typical cap: 5, worst is 0.66/0.24
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.min.rpt (HOLD):
    • Fastest min: -1.48 -- HOLD VIO
    • Slowest min: -2.12 -- HOLD VIO
    • Typical min: -1.66 -- HOLD VIO

NOTE: I think we can ignore fanout in UPW because it is incorrectly using a fanout of 10 (instead of 4) and all fanouts should be from within the design macros anyway...?

Test 2

  • Change number of cores on the VM to 8, and maybe 6 or 7 for routing.
  • Do exactly the same runs, but with generic SDC file now added in.
  • Compare results, esp:
    • Does it change the GDS at all? Hopefully not. If it does, add 'antennas' to the checklists below.

Results for TEA:

  • NOTE: I've noticed there is a subtle difference between TEA test-1 and test-2. Same area and pin placement, and what LOOKS like basically the same cell placement, but slightly different routing.
  • Time: 13:46
  • GDS size: 26MB (26670610)
  • Details from olsum full-summary top_ew_algofoogle --run:
    • CoreArea_um^2: 910,571.1552
    • Util: 36.19%
    • Antennas: 0
  • reports/signoff/37-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fanout: 1608 vios, most ~8/4, but some up to 34/4 (on clkbufs)!
    • Fastest max: 15.35
    • Fastest slew: 0
    • Fastest cap: 0
    • Slowest max: -20.83
    • Slowest slew: 37 vios, worst 6.19/3.00
    • Slowest cap: 0
    • Typical max: 9.1
    • Typical slew: 17 vios, worst 3.44/3.00
    • Typical cap: 0

Results for UPW:

  • Time: 3:23
  • GDS size: 28MB (28368806)
  • Details from olsum full-summary user_project_wrapper --run:
    • WNS: -9.9
    • TNS: -242.06
    • Antennas: 30 pin, 30 net: Worst is 353%. 5 exceed 210% (see reports/signoff/30-antenna_violators.rpt).
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fastest max: 3.4
    • Fastest slew: 10, worst is 6.66/2.60
    • Fastest cap: 5, worst is 0.63/0.23
    • Slowest max: -39.68
    • Slowest slew: 33, worst is 17.11/3.0
    • Slowest cap: 5, worst is 0.63/0.25
    • Typical max: -9.75 -- SETUP VIO
    • Typical slew: 13, worst is 10.05/3.0
    • Typical cap: 5, worst is 0.63/0.24
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.min.rpt (HOLD):
    • Fastest min: -1.51 -- HOLD VIO
    • Slowest min: -2.13 -- HOLD VIO
    • Typical min: -1.69 -- HOLD VIO

Test 3

  • Change UPW CLOCK_PERIOD to 35 instead of 25. Does it make a difference? Only need to run it for UPW.

Results for UPW:

  • Time: 3:39
  • GDS size:
  • Details from olsum full-summary user_project_wrapper --run:
    • WNS: 0??
    • TNS: 0??
    • Antennas: 30 pin, 30 net: Worst is 353%. 5 exceed 210% (see reports/signoff/30-antenna_violators.rpt) -- same as Test 2
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fastest max: 10.97
    • Fastest slew: 10, worst is 6.66/2.60
    • Fastest cap: 5, worst is 0.63/0.23
    • Slowest max: -29.68
    • Slowest slew: 33, worst is 17.11/3.0
    • Slowest cap: 5, worst is 0.63/0.25
    • Typical max: 0.25 -- OK!
    • Typical slew: 13, worst is 10.05/3.0
    • Typical cap: 5, worst is 0.63/0.24
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.min.rpt (HOLD):
    • Fastest min: -1.51 -- HOLD VIO
    • Slowest min: -2.13 -- HOLD VIO
    • Typical min: -1.69 -- HOLD VIO

The only thing that changes is setup timing (and evidently by exactly the change in CLOCK_PERIOD).

Test 4

  • Change TEA config.json to include:
    "PL_RESIZER_HOLD_SLACK_MARGIN": 0.4,
    "GLB_RESIZER_HOLD_SLACK_MARGIN": 0.2,
    "PL_RESIZER_HOLD_MAX_BUFFER_PERCENT": 80,
    "GLB_RESIZER_HOLD_MAX_BUFFER_PERCENT": 80,
    ...and if necessary do something similar for _SETUP_ equivalents. Note that the figures above represent ×4 for Holds, and would be ×8 for Setups.
  • Run on Legion 7 laptop.

Results for TEA:

NOTE: This failed with a crash as it has done a few times at step 20, so I changed FP_CORE_UTIL from 35 to 34.

  • Time: 15:09
  • GDS size: 27MB (27462182)
  • Details from olsum full-summary top_ew_algofoogle --run:
    • WNS: 0
    • TNS: 0
    • Dimensions: 970x968
    • CoreArea_um^2: 939,115.3408
    • Util: 35.09%
    • Antennas: 1 (123%)
  • reports/signoff/37-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fanout: 1608 vios, most ~8/4, but some up to 32/4 (on clkbufs)!
    • Fastest max: 15.49
    • Fastest slew: 0
    • Fastest cap: 0
    • Slowest max: -21.42
    • Slowest slew: 43 vios, worst 5.87/3.00
    • Slowest cap: 0
    • Typical max: 8.88
    • Typical slew: 17 vios, worst 3.26/3.00
    • Typical cap: 0

Results for UPW:

  • Time: 3:19
  • GDS size:
  • Details from olsum full-summary user_project_wrapper --run:
    • WNS: -10.12
    • TNS: -251.86
    • Antennas: 28 pin, 28 net: Worst is 351%. 5 exceed 210% (see reports/signoff/30-antenna_violators.rpt) -- NOTE: MOST are on LA pins
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fastest max: 3.23
    • Fastest slew: 10, worst is 7.51/2.60
    • Fastest cap: 5, worst is 0.71/0.23
    • Slowest max: -40.27
    • Slowest slew: 39, worst is 19.29/3.0
    • Slowest cap: 5, worst is 0.71/0.25
    • Typical max: -9.97
    • Typical slew: 13, worst is 10.05/3.0
    • Typical cap: 5, worst is 0.63/0.24
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.min.rpt (HOLD):
    • Fastest min: -1.44 -- HOLD VIO
    • Slowest min: -1.99 -- HOLD VIO
    • Typical min: -1.61 -- HOLD VIO

Test 5

  • Ease off hold slack margins (×2 instead of ×4).
  • Reduce FP_CORE_UTIL from 34 to 30.
  • Reduce PL_TARGET_DENSITY from 0.45 to 0.4

Results for TEA:

  • Time: 12:08
  • GDS size: 27MB (27470586)
  • Details from olsum full-summary top_ew_algofoogle --run:
    • WNS: 0
    • TNS: 0
    • Dimensions: 1033x1031
    • CoreArea_um^2: 1,064,610.5344
    • Util: 30.95%
    • Antennas: 0
  • reports/signoff/37-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fanout: 1654 vios, most ~8/4, but some up to 42/4 (on clkbufs)!
    • Fastest max: 15.46
    • Fastest slew: 0
    • Fastest cap: 0
    • Slowest max: -20.35
    • Slowest slew: 49 vios, worst 5.31/3.00
    • Slowest cap: 0
    • Typical max: 9.33
    • Typical slew: 14 vios, worst 3.13/3.00
    • Typical cap: 0

Results for UPW:

  • Time: 3:19
  • GDS size: 28MB (29161934)
  • Details from olsum full-summary user_project_wrapper --run:
    • WNS: -9.67
    • TNS: -237.17
    • Antennas: 28 pin, 28 net: Worst is 300%. 4 exceed 210% (see reports/signoff/30-antenna_violators.rpt) -- NOTE: MOST are on LA pins, but the WORST are on IOs
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fastest max: 3.42
    • Fastest slew: 10, worst is 6.24/2.60
    • Fastest cap: 5, worst is 0.59/0.23
    • Slowest max: -39.20
    • Slowest slew: 45, worst is 16.03/3.0
    • Slowest cap: 5, worst is 0.59/0.25
    • Typical max: -9.52
    • Typical slew: 10, worst is 9.42/3.0
    • Typical cap: 5, worst is 0.59/0.24
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.min.rpt (HOLD):
    • Fastest min: -1.50 -- HOLD VIO
    • Slowest min: -2.18 -- HOLD VIO
    • Typical min: -1.69 -- HOLD VIO

Test 6

  • Better assignment of IOs and LAs
  • Reduce or remove gpout muxes
  • Rename top_ew_algofoogle to top_raybox_zero_fsm
  • Remember:
    • Create top_raybox_zero_fsm.v in raybox-zero repo.
    • Create a snippet for it, to add to UPW.
    • Make registered outputs cover the new direct RGB outputs.
    • Change gf180-base README example to use new name.

IO pad assignments:

Digital WLCSP QFN Dir rbz-fsm function SoC dir SoC function
mprj_io[0] D7 31 IO JTAG: JTAG system access
mprj_io[1] E9 32 Out SDO: Housekeeping SPI
mprj_io[2] F9 33 In SDI: Housekeeping SPI
mprj_io[3] E8 34 In CSB: Housekeeping SPI
mprj_io[4] F8 35 In SCK: Housekeeping SPI
mprj_io[5] E7 36 In ser_rx: UART receive
mprj_io[6] F7 37 Out ser_tx: UART transmit
mprj_io[7] E5 41 In irq: External interrupt
mprj_io[8] F5 42 Out hsync Out flash2_csb: User area QSPI
mprj_io[9] E4 43 Out vsync Out flash2_sck: User area QSPI
mprj_io[10] F4 44 Out r0 IO flash2_io[0]: User area QSPI
mprj_io[11] E3 45 Out r1 IO flash2_io[1]: User area QSPI
mprj_io[12] F3 46 Out g0 IO
mprj_io[13] D3 48 Out g1 IO
mprj_io[14] E2 50 Out b0 IO
mprj_io[15] F1 51 Out b1 IO
mprj_io[16] E1 53 Out tex_csb IO
mprj_io[17] D2 54 Out tex_sclk IO
mprj_io[18] D1 55 IO tex_io0 IO
mprj_io[19] C1 57 In tex_io1 IO
mprj_io[20] C2 58 In tex_io2 IO
mprj_io[21] B1 59 In tex_io3 IO
mprj_io[22] B2 60 In vec_csb IO
mprj_io[23] A1 61 In vec_sclk IO
mprj_io[24] C3 62 In vec_mosi IO
mprj_io[25] A3 2 In reg_csb IO
mprj_io[26] B4 3 In reg_sclk IO
mprj_io[27] A4 4 In reg_mosi IO
mprj_io[28] B5 5 In debug_vec IO
mprj_io[29] A5 6 In debug_map IO
mprj_io[30] B6 7 In debug_trace IO
mprj_io[31] A6 8 In outs_reg_enb IO
mprj_io[32] A7 11 In mode0 Out spi_sck:
mprj_io[33] C8 12 In mode1 Out spi_csb:
mprj_io[34] B8 13 In mode2 In spi_sdi:
mprj_io[35] A8 14 Out gpout0 Out spi_sdo:
mprj_io[36] B9 15 Out gpout1 IO flash_io[2]: Optional XIP QSPI?
mprj_io[37] A9 16 Out gpout2 IO flash_io[3]: Optional XIP QSPI?

Results for TEA:

  • Time: 13:03
  • GDS size: 26MB (26574150)
  • Details from olsum full-summary top_raybox_zero_fsm --run:
    • WNS: 0
    • TNS: 0
    • Dimensions: 1024x1023
    • CoreArea_um^2: 1,047,347.4816
    • Util: 30.93%
    • Antennas: reports/signoff/46-antenna_violators.rpt
      • 1: 107%
  • reports/signoff/37-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fanout: 1612 vios, most ~8/4, but some up to 36/4 (on clkbufs)!
    • Fastest max: 16.91
    • Fastest slew: 0
    • Fastest cap: 0
    • Slowest max: -20.59
    • Slowest slew: 26 vios, worst 5.46/3.00
    • Slowest cap: 0
    • Typical max: 9.37
    • Typical slew: 8 vios, worst 3.13/3.00
    • Typical cap: 0

Results for UPW:

  • Time: 3:26
  • GDS size: 27MB (28266134)
  • Details from olsum full-summary user_project_wrapper --run:
    • WNS: -9.63
    • TNS: -1712.88
    • Antennas: reports/signoff/30-antenna_violators.rpt
      • 18 pin, 18 net
      • Worst is 858%!
      • Most (14) exceed 210% -- NOTE: MOST are on IOs
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.checks.rpt:
    • Fastest max: -2.78
    • Fastest slew: 8, worst is 7.73/2.60
    • Fastest cap: 4, worst is 0.73/0.23
    • Slowest max: -39.44
    • Slowest slew: 26, worst is 19.83/3.0
    • Slowest cap: 4, worst is 0.73/0.25
    • Typical max: -9.48
    • Typical slew: 8, worst is 11.64/3.0
    • Typical cap: 4, worst is 0.73/0.24
  • reports/signoff/20-sta-rcx_nom/multi_corner_sta.min.rpt (HOLD):
    • Fastest min: -1.73 -- HOLD VIO
    • Slowest min: -2.70 -- HOLD VIO
    • Typical min: -2.03 -- HOLD VIO
  • NOTE: Hold violations at UPW stage seem to be confined to LAs: Most are on LA[0] (reset), with a couple of others (gpout_sel?):
    • 1x la_data_in[11]: -2.03
    • 1x la_data_in[8]: -1.28
    • 1x la_data_in[4]: -1.19
    • 263x la_data_in[0]: <= -1.06

Next steps

  • Test 7+
  • Update README to include instructions for hardening, inc. env vars.
  • Extra raybox-zero improvements: Store extra 2 bits from texture memory, and make it selectable via gpouts.
  • Change GFMPW1_SNIPPET_top_raybox_zero_fsm.v to use local names ('rbz_fsm_io...' instead of direct 'io...') so it can be easily wired up with a mux.
  • Do we need more a0s and a1s? Do we need to do io_out[] assignments for pins that are never outputs?