/* ------------------------------------------------------------------------- */
DATE    :
	February 1999
PROBLEM DESCR:
	GSLT <> DC: DC FLT-no out of sync with GSLT

SOLUTION:
	Stuck bit in the communication between CFLTP and NIM-electronics;
after moving some cables and resetting the NIM-electronics the problem went
away.

/* ------------------------------------------------------------------------- */
DATE    :
	January 1999
PROBLEM DESCR:
	CAL-SLT error report: RCAL: ###ARE3.ERROR mask #0080
at a regular rate of about 1 per second for the duration of every ZEUS run.
This lasted for about 4 days....
The GSLT investigated the type of error that occurred:
bit 15 of the SLT-error was set, meaning: error in global summing
at CAL-SLT Layer3.

SOLUTION:
	From a mail from Nichol Brummer:
"  It turns out that the CAL laser has been in
a wrong setting from friday till last night.
After it was switched back to its proper
setting, it is not anymore firing into random
physics events, causing overflows in the
CAL SLT energy summing."

/* ------------------------------------------------------------------------- */
DATE    :
	October 1998
PROBLEM DESCR:
	Data corruption in FCAL crate data (in data from all crates,
even crates that do not exist...(crate 7,10,12)), reported offline(?).

SOLUTION:
	Not caused by the CALDAQ system, but further down the chain: all
corrupted data passed through TLT branch #2 => probable error in
'RADSTONE' EVB-TLT board.

/* ------------------------------------------------------------------------- */
DATE    :
	October 1998
PROBLEM DESCR:
	At startup all DCs report a 'DC.BOOT.FAIL'.

SOLUTION:
	While some crates were being switched off to try to fix a problem with
a flaky transputer link, the power to the NEVIS fan-out to the DC-crates
tripped, but this was not noticed.
Switching on the power solved the problem immediately.

/* ------------------------------------------------------------------------- */
DATE    :
	June 1998
PROBLEM DESCR:
	The system comes to a halt in the 'GAINS' run of a calibration;
the logfile shows that the readout-tp in RCAL crate 5 crashes.

SOLUTION:
	The debugger shows that there is some integer overflow in the
calculations done at the end of the 'GAINS' run (calculating the 'cross gain
spread'), for DC 13.
The input to DC 13 was unplugged and the calibration runs went fine;
the input was replugged and the calibration runs still ran fine...
==> bad connection at the connector ?

/* ------------------------------------------------------------------------- */
DATE    :
	Jan 1998
PROBLEM DESCR:
	FCAL kept hanging in calibration runs, more precisely: the new FPC
crate -FCAL crate 5- was hanging while outputting data at the end of a
calibration run; no obvious reports in the logfile that point to a hardware
problem.

SOLUTION:
	After long investigations (code/debugger) I determined the interrupts
between the 2 transputers on the 2TP were not working;
a test with the CSB testprogram -a bit late, but I didn't expect a hardware
problem- showed a faulty TRIGGER tp;
exchanging the 2TP did not help, exchanging the 64-wire cable did;
it got damaged while being handled a lot (why, I don't know).

/* ------------------------------------------------------------------------- */
DATE    :
	Jan 1998
PROBLEM DESCR:
	Many boot problems in the FCAL crate when starting up;
the CSB test showed many "LKC boot problems";
this was one of the first times of starting up after a winter stop, to test
the addition of the FPC readout crate (FCAL crate 5).

SOLUTION:
	Many module swaps with the RCAL-CSB showed problems in an ARE, an LKS
and an LKC module; all 3 very probably have a hardware failure;
a 2tp broke around the same time.
Possible cause: 64-wire cable to CSB put in DigCard P2 connector (which has
its own pin definitions) instead of normal P2 (only slot 1 and 2 of
a Digital Card VME-crate !).

April 1998, modules are at NIKHEF:
2TP:
  had a linkbuffer chip with cover blown off...; repaired by INCAA.
LKS (ser.no.#5):
  had roasted buffer chip and a few C004 channels broken;
  board damaged; patched with wires; C004 linkswitch chip replaced.
ARE (ser.no.#7):
  buffer chip (LS31) replaced.
LKC (ser.no.#10):
  burnt-out buffer chip and 1 PCB track acted as fuse...;
  3 connected C012 broken; chips replaced, PCB track fixed with wire.

### This was a very serious incident which cost a lot of time and effort
    to fix; care should be taken when moving 2tp and CSB modules about !!

/* ------------------------------------------------------------------------- */
DATE    :
	Sep 6 1997
PROBLEM DESCR:
	GSLT complaining "waiting for data from CAL", GSLT CAL buffer empty.
	However CALDAQ complains about "RCAL buffers to EVB full",
	and indeed iserver.log shows that the ROCOLLECT buffer
	in TPM is full, but EVB for some reason does not empty it.

	It can clearly be seen that the RCAL ROCOLLECT is blocked because of
	full buffers. Note that FCAL and BCAL are also not empty:

>>>> FCAL ROCOLLECT STATUS<<<
head.ptr, tail.ptr =       9093     71017
calec.tail         =       9093
space.requirement  =       2236
evb.events.written =      98370
data.len           = 223,240,205,220
trigger.no         = 98368,98368,98368,98368
frontend.no        = 5,7,8,10

>>>> BCAL ROCOLLECT STATUS<<<
head.ptr, tail.ptr =      99369     22982
calec.tail         =      99369
space.requirement  =       2529
evb.events.written =      98370
data.len           = 180,168,199,183
trigger.no         = 98368,98368,98368,98368
frontend.no        = 0,4,8,10

>>>> RCAL ROCOLLECT STATUS<<<
head.ptr, tail.ptr =      48058     50917
calec.tail         =      48058
reserve.result     =         -1
space.requirement  =       2949
evb.events.written =      98334
data.len           = 168,175,278,458
trigger.no         = 98340,98340,98340,98340
frontend.no        = 3,2,10,8
event.no(collector)= 98335,98336,98337,98338,98339,98333,98334
event.complete     = #00FFF,#00FFF,#00FFF,#00FFF,#00FFF,#00FFF,#00FFF

SOLUTION:
	Unknown. Ulf Behrens has been asked about a possible problem with
the EVB: he says he knows about this effect, doesn't know the reason, and we
have to wait for it to disappear....
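
A quick way to see which component is stuck is to compare the
'evb.events.written' counters from the three status dumps above. The following
is only a sketch: the numbers are copied from the dumps, but the dictionary
layout and the function name lagging_components are made up for illustration
and are not part of CALDAQ.

```python
# Counters copied from the FCAL/BCAL/RCAL ROCOLLECT status dumps above.
# The dict layout and key names are hypothetical, chosen for this sketch only.
status = {
    "FCAL": {"evb_events_written": 98370, "trigger_no": 98368},
    "BCAL": {"evb_events_written": 98370, "trigger_no": 98368},
    "RCAL": {"evb_events_written": 98334, "trigger_no": 98340},
}


def lagging_components(status):
    """Return {component: number of events behind the most advanced one}."""
    newest = max(s["evb_events_written"] for s in status.values())
    return {
        name: newest - s["evb_events_written"]
        for name, s in status.items()
        if s["evb_events_written"] < newest
    }


print(lagging_components(status))  # {'RCAL': 36}
```

With the values above only RCAL lags (by 36 events), which matches the
"RCAL buffers to EVB full" complaint.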

/* ------------------------------------------------------------------------- */
DATE    :
	Aug-Sep 1997
PROBLEM DESCR:
	No data, no triggers; sometimes a few, then stops.

SOLUTION:
	iserver.log sometimes showed the following in the 'RCAL ROCOLLECT STATUS':

event.no(collector)= 0,   -1,   -9,   -9,   -9,   -9,   -9,   -9
event.complete     = #005FF,#005FF,#00000,#00000,#00000,#00000,#00000

although the boot/readout mask for RCAL was #0DFF; this means that the crate
with maskbit #0800, RCAL crate #12, was not sending its data.
Just that afternoon the delay-box (for components FNC and PRT in this crate)
had been exchanged; it was exchanged back for the old one and things ran again
(excluding the crate from the boot/readout mask was tried first).
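
The mask comparison used above can be sketched as follows. This is an
illustration only: the function name missing_crates is hypothetical, and the
mapping "mask bit n = crate n+1" is an assumption inferred from maskbit #0800
belonging to RCAL crate #12.

```python
def missing_crates(readout_mask, event_complete):
    """Return crate numbers enabled in readout_mask but absent from event_complete."""
    missing_bits = readout_mask & ~event_complete
    # Assumption: mask bit n corresponds to crate n + 1 (so #0800 -> crate 12).
    return [
        bit + 1
        for bit in range(missing_bits.bit_length())
        if (missing_bits >> bit) & 1
    ]


# Boot/readout mask #0DFF versus the observed event.complete value #005FF.
print(missing_crates(0x0DFF, 0x05FF))  # [12]
```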

/* ------------------------------------------------------------------------- */
DATE    :
	Aug 30 1997
PROBLEM DESCR:
	GFLT complains about fatal error from component CAL.

SOLUTION:
	This had nothing to do with the CALDAQ transputer network, but
was a problem in the connection CFLT-->GFLT;
solved by 'restarting CFLT'.

/* ------------------------------------------------------------------------- */
DATE    :
	Aug 1997
PROBLEM DESCR:
	Errors of the following kind reported in the logfile, sometimes with
hangs of the system, almost certainly when the 'UNKNOWN ERROR' occurs:

###LAYER2/3 tp #1213 link 2 SYNC.WORD=#AABB8CDD: BITFLIP ERROR
###LAYER2/3 tp #1213 link 2 SYNC.WORD=#00000000: UNKNOWN ERROR

SOLUTION:
	This is a link communication problem with link 2 of the FCAL LAYER2 #3
transputer (which comes -via the CSB- from FCAL LAYER1 tp in FCAL crate 1
(see CALDAQ network drawing)).
Actions:
	1. switch off/on the crates involved; see if that helps...
	2. unplug/plug the cable from the CSB to the LAYER2 tp,
and unplug/plug the LAYER2 2tp-module and the READOUT/TRIGGER
2tp-module.

Only 2. was tried and didn't seem to help, but soon after (after a general
power failure?) the problem was gone, at least for the moment...
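
To see which bit flipped in a reported SYNC.WORD, the word can be XORed with
the expected pattern. The sketch below ASSUMES the expected sync word is
#AABBCCDD; this value is not stated anywhere in this log and is only guessed
from the BITFLIP report above. The function name classify_sync_word and the
max_flips threshold are likewise hypothetical and not taken from the CALDAQ
code.

```python
# ASSUMPTION: expected sync pattern, guessed from the BITFLIP report above.
EXPECTED_SYNC = 0xAABBCCDD


def classify_sync_word(word, expected=EXPECTED_SYNC, max_flips=2):
    """Return (label, flipped_bits) for a 32-bit sync word.

    max_flips is a made-up threshold: up to that many differing bits are
    called a bitflip, anything more is labelled unknown.
    """
    diff = (word ^ expected) & 0xFFFFFFFF
    flipped = [bit for bit in range(32) if (diff >> bit) & 1]
    if not flipped:
        return "OK", flipped
    if len(flipped) <= max_flips:
        return "BITFLIP ERROR", flipped
    return "UNKNOWN ERROR", flipped


print(classify_sync_word(0xAABB8CDD))     # ('BITFLIP ERROR', [14])
print(classify_sync_word(0x00000000)[0])  # UNKNOWN ERROR
```

Under this assumption the first error above is a single flip of bit 14, and
the all-zero word differs in too many bits to be a simple bitflip.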


/* ------------------------------------------------------------------------- */