To help with troubleshooting, several Grafana helper dashboards have been created in the DCS Grafana folder.

By default, the DCS Error Dashboard displays ALL the errors captured by DCS in an adjustable time range. A series of tabs lets you select elements, zones, error types, and so on.

The same dashboard also contains statistical information that helps identify recurrent errors and their source.
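
As an illustration of the kind of statistic meant here, the sketch below counts how often each element and error pair recurs in a list of error records. It is a hedged Python example, not the dashboard's actual query: the record fields (`element`, `zone`, `error`), the sample values and the `recurrent_errors` helper are all assumptions introduced for illustration.

```python
from collections import Counter

# Hypothetical error records; field names and values are assumptions for this sketch.
errors = [
    {"element": "ELEM_A", "zone": "ZONE_1", "error": "serial timeout"},
    {"element": "ELEM_A", "zone": "ZONE_1", "error": "serial timeout"},
    {"element": "ELEM_B", "zone": "ZONE_2", "error": "serial timeout"},
]


def recurrent_errors(records, min_count=2):
    """Return (element, error) pairs seen at least min_count times, most frequent first."""
    counts = Counter((r["element"], r["error"]) for r in records)
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


print(recurrent_errors(errors))
```

An element that shows up repeatedly with the same error is a natural first candidate for the procedures described in the rest of this page.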


Magnet serial timeout 


A serial timeout means that there is a communication problem between the devil and the magnet device. In this case, perform the following actions:

1 - Ensure the magnet is ON and in "remote" mode (remotely controllable).

2 - Open the DCS Element dashboard and select the 'Controlled element' that is in serial timeout. The dashboard returns the devil number and other information, such as the additional devices involved in the communication (for example the moxa and its ports).

3 - With the devil number, open the Devil Manager and click the corresponding devil number to open a VNC session to the LabVIEW application, then try to stop and start the devil. If the LabVIEW window is not present, the devil has crashed and you need to click the reset button in the Devil Manager.


If the procedure does not work, the serial timeout error appears again in the DCS Error Dashboard, and the magnet cannot be controlled from the magterminal, a hardware problem is likely and a deep diagnosis is needed.

Below is an example: the 'Controlled Element' DHPTB101 returns devil 654, and entering 654 in the 'Devil' selector gives the controlling moxa:


Deep diagnosis

The DCS Element dashboard shows whether a moxa is involved in the connection. If so, the full SW/HW stack must be checked.

DO NOT RESTART THE MOXA unless the diagnosis is complete or the moxa is not accessible.

  1. Check the integrity of the cabling from the moxa port (found in the DCS Element dashboard) to the magnet interface.
  2. With the devil running, check whether the TX LED of the port is blinking (the devil is trying to access the hardware).
    1. If not, check that the devil is running and that the Moxa serial lines documentation is aligned with the DCS Element dashboard.
    2. Access the Moxa web interface (see Login info and Passwords), navigate to Monitor→Async, and look at the debug information for the relevant port (see the picture below). It shows the bytes sent by the devil (TX) and by the magnet (RX). If the communication between the devil and the moxa is OK, the TX count should increase; if the communication between the magnet and the moxa is also OK, the RX count should increase as well. Remember that the page is refreshed each time you click 'Async'. Each time the RX/TX LED blinks, the corresponding counter should increase.
  3. If the RX LED of the corresponding port is not blinking, no data is coming from the magnet.
  4. To quickly test the full HW/SW stack up to the magnet, use a loopback cap ("il tappo", the plug) that connects TX to RX. You should see the TX/RX LEDs blinking and the counters in the moxa interface increasing. If so, the serial interface of the magnet must be repaired or replaced. Otherwise there is a misconfiguration that requires further investigation.
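
The counter checks above can be summarised as a small decision helper. The following is a minimal Python sketch, not part of the DCS tooling: the `PortCounters` type, the `diagnose_port` function and the returned messages are hypothetical names introduced for illustration, and the TX/RX byte counts are assumed to be copied by hand from the Monitor→Async page before and after a short wait.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PortCounters:
    """TX/RX byte counters of one moxa port, copied by hand from the Async page."""
    tx: int  # bytes sent by the devil towards the magnet
    rx: int  # bytes received from the magnet


def diagnose_port(before: PortCounters, after: PortCounters,
                  loopback_ok: Optional[bool] = None) -> str:
    """Suggest the next step from two readings taken a few seconds apart.

    loopback_ok is the outcome of the loopback-cap test (True if TX/RX
    blink and both counters increase), or None if it has not been run yet.
    """
    tx_moving = after.tx > before.tx
    rx_moving = after.rx > before.rx

    if not tx_moving:
        # The devil is not reaching the moxa port at all.
        return ("TX not increasing: check that the devil is running and that "
                "the serial lines documentation matches the DCS Element dashboard")
    if rx_moving:
        return "TX and RX increasing: devil-moxa-magnet communication looks OK"
    if loopback_ok is None:
        return "RX not increasing: no data from the magnet, run the loopback-cap test"
    if loopback_ok:
        # The stack works up to the cable end, so the fault is on the magnet side.
        return "Loopback OK: repair or replace the magnet serial interface"
    return "Loopback failed: suspect a misconfiguration, investigate further"


# Example: TX grows between the two readings but RX stays flat.
print(diagnose_port(PortCounters(tx=120, rx=40), PortCounters(tx=180, rx=40)))
```

The helper deliberately takes two readings instead of one, because a single non-zero counter says nothing about whether traffic is flowing right now.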



DCS Troubleshooting Dashboards


DCS Elements

Information dashboard covering elements, devils, and moxas.

DCS Elements


DCS Errors

The history of errors, with filters.

Error History

DCS Dante State

Information about devil status (live, errors).

Dante State


DCS Command History

The history of the commands sent to elements.

Command History

Alert

Page that shows the status of all defined alerts.

Alerts Status









