On Sun, Apr 29, 2012 at 10:49:44AM -0300, Mauro Carvalho Chehab wrote:
> > [ 10.486440] EDAC MC: DCT0 chip selects:
> > [ 10.486443] EDAC amd64: MC: 0: 2048MB 1: 2048MB
> > [ 10.486445] EDAC amd64: MC: 2: 2048MB 3: 2048MB
> > [ 10.486448] EDAC amd64: MC: 4: 0MB 5: 0MB
> > [ 10.486450] EDAC amd64: MC: 6: 0MB 7: 0MB
> > [ 10.486453] EDAC DEBUG: amd64_debug_display_dimm_sizes: F2x180 (DRAM
> > Bank Address Mapping): 0x00000088
> > [ 10.486455] EDAC MC: DCT1 chip selects:
> > [ 10.486458] EDAC amd64: MC: 0: 2048MB 1: 2048MB
> > [ 10.486460] EDAC amd64: MC: 2: 2048MB 3: 2048MB
> > [ 10.486463] EDAC amd64: MC: 4: 0MB 5: 0MB
> > [ 10.486465] EDAC amd64: MC: 6: 0MB 7: 0MB
> > [ 10.486467] EDAC amd64: using x8 syndromes.
> > [ 10.486469] EDAC DEBUG: amd64_dump_dramcfg_low: F2x190 (DRAM Cfg Low):
> > 0x00083100
> > [ 10.486472] EDAC DEBUG: amd64_dump_dramcfg_low: DIMM type: buffered;
> > all DIMMs support ECC: yes
> > [ 10.486475] EDAC DEBUG: amd64_dump_dramcfg_low: PAR/ERR parity: enabled
> > [ 10.486478] EDAC DEBUG: amd64_dump_dramcfg_low: DCT 128bit mode width:
> > 64b
> > [ 10.486481] EDAC DEBUG: amd64_dump_dramcfg_low: x4 logical DIMMs
> > present: L0: yes L1: yes L2: no L3: no
> > [ 10.486485] EDAC DEBUG: f1x_early_channel_count: Data width is not 128
> > bits - need more decoding
> > [ 10.486488] EDAC amd64: MCT channel count: 2
> > [ 10.486493] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc():
> > allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)
> > [ 10.486501] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: rank0
> > (0:0:0): row 0, chan 0
> > [ 10.486506] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: rank1
> > (0:1:0): row 0, chan 1
> > [ 10.486510] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: rank2
> > (1:0:0): row 1, chan 0
> > [ 10.486514] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: rank3
> > (1:1:0): row 1, chan 1
> > [ 10.486518] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: rank4
> > (2:0:0): row 2, chan 0
> > [ 10.486522] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: rank5
> > (2:1:0): row 2, chan 1
> > [ 10.486526] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: rank6
> > (3:0:0): row 3, chan 0
> > [ 10.486530] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: rank7
> > (3:1:0): row 3, chan 1
> > [ 10.486534] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: rank8
> > (4:0:0): row 4, chan 0
> > [ 10.486538] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: rank9
> > (4:1:0): row 4, chan 1
> > [ 10.486542] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: rank10
> > (5:0:0): row 5, chan 0
> > [ 10.486546] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: rank11
> > (5:1:0): row 5, chan 1
> > [ 10.486550] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: rank12
> > (6:0:0): row 6, chan 0
> > [ 10.486554] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: rank13
> > (6:1:0): row 6, chan 1
> > [ 10.486558] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: rank14
> > (7:0:0): row 7, chan 0
> > [ 10.486562] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: rank15
> > (7:1:0): row 7, chan 1
> >
> > DCT0 has 4 ranks + DCT1 also 4 ranks = 8 ranks total.
> >
> > Now your change is showing 16 ranks. Still b0rked.
>
> No, DCT0+DCT1 have 16 ranks, 8 filled and 8 empty. So, it is OK.
>
> As I said before when you've pointed this bug (likel at v3 review),
> edac_mc_alloc
> doesn't know how many ranks are filled, as the driver logic first calls it to
> allocate for the max amount of ranks, and then fills the rank with their info
> (or let them untouched with 0 pages, if they're empty).
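
To make the quoted explanation concrete, here is a rough userspace sketch of the "allocate for the maximum, then fill what is populated" pattern. It is not the EDAC code: struct rank_slot, init_slots(), fill_slots() and count_populated() are hypothetical names made up for illustration, and the 8 csrows x 2 channels geometry is simply taken from the log above.

```c
#include <stddef.h>

#define MAX_CSROWS   8	/* chip selects per DCT, as in the log above */
#define MAX_CHANNELS 2	/* DCT0 and DCT1 */
#define MAX_RANKS    (MAX_CSROWS * MAX_CHANNELS)

/* Hypothetical stand-in for a per-rank record; NOT the kernel's dimm_info. */
struct rank_slot {
	unsigned int csrow;
	unsigned int channel;
	unsigned long size_mb;	/* 0 means the slot is empty */
};

/* Step 1: lay out every *possible* slot, all of them still empty. */
static void init_slots(struct rank_slot *slots)
{
	for (unsigned int row = 0; row < MAX_CSROWS; row++) {
		for (unsigned int chn = 0; chn < MAX_CHANNELS; chn++) {
			/* Same ordering as the log: row 0 chan 0, row 0 chan 1, ... */
			struct rank_slot *s = &slots[row * MAX_CHANNELS + chn];

			s->csrow = row;
			s->channel = chn;
			s->size_mb = 0;
		}
	}
}

/* Step 2: the driver fills in only what the hardware reports as populated. */
static void fill_slots(struct rank_slot *slots,
		       const unsigned long sizes_mb[MAX_CHANNELS][MAX_CSROWS])
{
	for (unsigned int row = 0; row < MAX_CSROWS; row++)
		for (unsigned int chn = 0; chn < MAX_CHANNELS; chn++)
			slots[row * MAX_CHANNELS + chn].size_mb = sizes_mb[chn][row];
}

/* Count the slots that actually carry memory. */
static size_t count_populated(const struct rank_slot *slots, size_t n)
{
	size_t populated = 0;

	for (size_t i = 0; i < n; i++)
		if (slots[i].size_mb)
			populated++;

	return populated;
}
```

Fed the chip-select sizes from the log (2048MB on selects 0-3 of each DCT, 0MB on the rest), this lays out 16 slots of which 8 end up populated, which is where the 16 vs. 8 difference comes from.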

Basically you're saying you're generating dimm_info structs for all
_possible_ dimms, and the loop where this debug message comes from goes
and merrily initializes them all although some of them are empty:

+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];

... definitely superfluous.

Oh well, looking at edac_mc_alloc, it used to allocate structs for all
csrows on the controller even though some of them were empty...

Ok, then please remove this debug call because it is misleading. Having

[ 10.486493] EDAC DEBUG: new_edac_mc_alloc: allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)

is enough.

You probably want to say how many channels/csrows there are, though:

[ 10.486493] EDAC DEBUG: new_edac_mc_alloc: allocating 3692 bytes for mci data (16 ranks, 8 csrows, 2 channels)

or something similar. Simply dump tot_dimms, tot_channels and tot_csrows
and that's it.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632
WEEE Registernr: 129 19551
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev