Modbus Comm Bug

Viewing 15 posts - 1 through 15 (of 21 total)
  • #7971
    JW
    Participant

    Hi,

    I recently noticed some abnormal behavior in the Modbus communication.

    I have 6 communication lines.
    4 lines are connected to my Modbus slave software at 127.0.0.1.
    Each of them has about 30 element groups, with 50 signals per group.

    Line parameters:
    main parameters | delay after request cycle: 3000
    request sequence | timeout: 1000; delay: 0

    https://drive.google.com/file/d/1dwiAdm80DvyIxSlUo5rAR29D0dhfSj5P/view?usp=sharing

    SCADA Comm detects a communication error every few hours, in the middle of a request cycle. The readings of the first half of the element groups are normal, but the second half of the element groups become undefined for a second. On the next cycle, they return to normal.
    https://drive.google.com/file/d/1LEwDlS-gVvI7ufeOx2fJr2tep1MV89Bf/view?usp=sharing

    The Modbus slave software reports that the connection was closed by the remote host (SCADA Comm):

    Error while processing data on socket 724: [WinError 10054] An existing connection was forcibly closed by the remote host

    It looks like SCADA Comm misaligns the replies from the Modbus slave software.
    https://drive.google.com/file/d/11y8e6zyzzKWXJovF0uj8TM22hCG9TJkH/view?usp=sharing
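
For readers unfamiliar with the framing: every Modbus TCP frame starts with a 7-byte MBAP header (transaction id, protocol id, length, unit id), and the server echoes the transaction id of the request it is answering. A reply that belongs to an earlier request can therefore be recognized by a transaction id mismatch. The following is a minimal Python sketch of that check, not code from SCADA Comm; the function names are made up for illustration.

```python
import struct

# MBAP header: transaction id (2), protocol id (2), length (2), unit id (1)
MBAP_HEADER = struct.Struct(">HHHB")


def build_read_holding_request(transaction_id, unit_id, start_address, count):
    """Build a Modbus TCP 'Read Holding Registers' (function 0x03) request."""
    pdu = struct.pack(">BHH", 0x03, start_address, count)
    # The length field counts the unit id byte plus the PDU.
    header = MBAP_HEADER.pack(transaction_id, 0, len(pdu) + 1, unit_id)
    return header + pdu


def check_response(request, response):
    """Return None if the response header matches the request, else a reason."""
    if len(response) < MBAP_HEADER.size + 1:
        return "response too short"
    req_tid, _, _, req_unit = MBAP_HEADER.unpack_from(request)
    tid, proto, length, unit = MBAP_HEADER.unpack_from(response)
    if proto != 0:
        return "protocol id is not 0"
    if tid != req_tid:
        return f"transaction id mismatch: sent {req_tid}, got {tid}"
    if unit != req_unit:
        return "unit id mismatch"
    # Everything after the 6 bytes of tid/proto/length is covered by 'length'.
    if length != len(response) - 6:
        return "length field does not match frame size"
    return None
```

If a late reply to request 6 arrives while the client is waiting for the reply to request 7, `check_response` reports the mismatch. Every later reply on the same connection is then likely to be off by one as well, which would match the symptom described above.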

    When I set the delay to 20 or 50 ms, it becomes even worse: it can never finish a complete request cycle of 50 element groups.

    I wrote a script to stress test my Modbus slave software but didn’t find any problem. I will do more tests later, but at the moment it doesn’t seem to be a problem with the Modbus slave.

    #7972
    JW
    Participant

    I just checked the SCADA Comm logs of other projects, all of which have similar settings.

    Most projects detect a communication line disconnection every few days.
    I had disabled the detailed log, so it only shows the disconnections but not the cause.
    I believe they may have the same cause.
    -----
    2021-01-04 06:36:05 Establish a TCP connection with 127.0.0.1:1502
    2021-01-04 10:55:21 Disconnect from 127.0.0.1
    2021-01-04 10:55:22 Establish a TCP connection with 127.0.0.1:1502
    2021-01-04 23:00:03 Disconnect from 127.0.0.1
    2021-01-04 23:00:04 Establish a TCP connection with 127.0.0.1:1502
    2021-01-06 12:28:49 Disconnect from 127.0.0.1
    2021-01-06 12:28:50 Establish a TCP connection with 127.0.0.1:1502
    2021-01-10 12:00:03 Disconnect from 127.0.0.1
    2021-01-10 12:00:04 Establish a TCP connection with 127.0.0.1:1502
    2021-01-11 02:57:53 Disconnect from 127.0.0.1
    2021-01-11 02:57:54 Establish a TCP connection with 127.0.0.1:1502
    -----
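
To put numbers on "every few days", the connection lifetimes can be computed from the log excerpt. The Python sketch below assumes only the two message formats visible in the excerpt ("Establish a TCP connection ..." and "Disconnect ..."); the function name is illustrative.

```python
from datetime import datetime

LOG = """\
2021-01-04 06:36:05 Establish a TCP connection with 127.0.0.1:1502
2021-01-04 10:55:21 Disconnect from 127.0.0.1
2021-01-04 10:55:22 Establish a TCP connection with 127.0.0.1:1502
2021-01-04 23:00:03 Disconnect from 127.0.0.1
2021-01-04 23:00:04 Establish a TCP connection with 127.0.0.1:1502
2021-01-06 12:28:49 Disconnect from 127.0.0.1
2021-01-06 12:28:50 Establish a TCP connection with 127.0.0.1:1502
2021-01-10 12:00:03 Disconnect from 127.0.0.1
2021-01-10 12:00:04 Establish a TCP connection with 127.0.0.1:1502
2021-01-11 02:57:53 Disconnect from 127.0.0.1
"""


def connection_lifetimes(log_text):
    """Return how long each TCP connection lasted, as timedelta objects."""
    lifetimes = []
    connected_at = None
    for line in log_text.splitlines():
        line = line.strip()
        try:
            stamp = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped log line
        message = line[20:]
        if message.startswith("Establish a TCP connection"):
            connected_at = stamp
        elif message.startswith("Disconnect") and connected_at is not None:
            lifetimes.append(stamp - connected_at)
            connected_at = None
    return lifetimes


for lifetime in connection_lifetimes(LOG):
    print(lifetime)
```

For this excerpt the result is five lifetimes, from 4:19:16 for the shortest to 3 days, 23:31:13 for the longest.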

    #7980
    JW
    Participant

    The frequency seems pretty random.

    In my test yesterday, it happened a few times per hour. But when I tried to repeat it today with the same PC and the same project, it didn’t happen at all.

    #7981
    JW
    Participant

    I am thinking about the following parameters.

    number of request retry on error: 3

    Currently, SCADA Comm retries within the same TCP connection.
    When the reply to a request is misaligned, retrying seems meaningless, because the header will never match again. After 3 attempts, it sets the element groups to undefined and then starts a new TCP session. Communication returns to normal on the new TCP connection, but a false alarm has already been triggered.
    Thus this setting doesn’t increase the tolerance for communication errors in this case.

    Is it possible to have another option to retry on a new TCP connection, or to count errors per failed TCP connection rather than per request?
    Then it could either continue from the previously failed element group or start all over again.
    When communication fails on 3 connections, the failed element groups would be set to undefined.
    This should give more tolerance for communication errors.

    StayConnected: True / False

    By default, it’s set to true.
    When set to false, SCADA Comm uses a new TCP session for each request cycle.
    But I am not able to test whether it changes the retry handling. Will it retry on the same TCP connection or on a new one? If SCADA Comm retries on a new TCP connection, it should solve my problem. But I guess this parameter does not affect the retry handling.
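
The proposal to count failures per TCP connection instead of per request can be sketched as follows. This is a hedged Python illustration of the idea only, not how SCADA Comm behaves; `connect`, `request`, `PollError` and `max_connections` are hypothetical names.

```python
class PollError(Exception):
    """Raised by a connection when a request fails or the reply is unusable."""


def poll_groups(connect, groups, max_connections=3):
    """Poll every group, reopening the connection after a failure.

    connect: hypothetical factory returning an object with
             request(group) and close() methods.
    Returns a dict mapping each group to its value; groups that still fail
    after max_connections connections map to None (undefined).
    """
    results = {}
    pending = list(groups)
    for _ in range(max_connections):
        if not pending:
            break
        conn = connect()
        try:
            while pending:
                results[pending[0]] = conn.request(pending[0])
                pending.pop(0)
        except PollError:
            # The byte stream may be misaligned, so drop this connection
            # and resume from the failed group on a fresh one.
            pass
        finally:
            conn.close()
    for group in pending:
        results[group] = None
    return results
```

With this policy a single misaligned reply costs one reconnect, and groups are marked undefined only after the same group has failed on three separate connections. Errors raised by `connect` itself are not handled in this sketch.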

    #7994
    Mikhail
    Moderator

    Hi,

    To be honest, I think that your network connection is quite stable. It disconnects only once every few days. It’s normal for a connection to break and reconnect sometimes.

    If you set StayConnected=true, it should work more reliably.

    I suppose that the disconnects are caused by random network errors.

    #7995
    Mikhail
    Moderator

    > When I set the delay to 20 or 50 ms, it becomes even worse: it can never finish a complete request cycle of 50 element groups.

    This is strange. The delay just does Thread.Sleep(50) after receiving an answer from a device.

    #7996
    Mikhail
    Moderator

    Have you thought about splitting the template? It would make a polling session shorter. A short template is better in case of communication errors, because there is no need to request 49 groups again if the 50th group fails.

    • This reply was modified 3 years, 9 months ago by Mikhail.
    #7998
    JW
    Participant

    > > When I set the delay to 20 or 50 ms, it becomes even worse: it can never finish a complete request cycle of 50 element groups.

    > This is strange. The delay just does Thread.Sleep(50) after receiving an answer from a device.

    I tried this setting again, and it works fine today. I can’t reproduce the problem, so it may not be the cause.

    #7999
    JW
    Participant

    > Have you thought about splitting the template? It would make a polling session shorter. A short template is better in case of communication errors, because there is no need to request 49 groups again if the 50th group fails.

    Requesting 30 groups (50 signals per group) works fine most of the time and takes less than 0.1 s, since the communication is from localhost (SCADA) to localhost (another program on the same server).

    The main issue I am facing is that, if I enable Event on Undefined for those sensors, there will be events/alarms for an undefined channel (disconnected sensor), followed by an event for the return to normal on the next request.

    #8017
    Mikhail
    Moderator

    > Requesting 30 groups (50 signals per group) works fine most of the time and takes less than 0.1 s

    It’s fast!

    > if I enable Event on Undefined for those sensors, there will be events/alarms for an undefined channel

    A possible solution is to develop a formula that changes the channel status in 2 steps: Unreliable, then Undefined. If the connection is restored on the next request, no events will appear.

    #8035
    JW
    Participant

    Hi Mikhail,

    I made a formula using your approach, which reduces the chance of a false Undefined event.

    But it raises a Normal event when the status changes from “Unreliable” to “Normal”. Is there any way not to trigger the Normal event?

    One way I can think of is to
    1. set one channel n1 using the two-step value and status
    2. set another channel n2 using Val(n1)
    but this is not convenient if I have lots of sensors.

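    // Value formula: on the first undefined reading keep the previous value;
    // on a repeated one (previous status already 5 or 0) report NaN.
    double TwoStepVal()
    {
        if (CnlStat != 0) { return CnlVal; }
        return (Stat() == 5 || Stat() == 0) ? double.NaN : Val();
    }

    // Status formula: 5 (Unreliable) on the first undefined reading,
    // 0 (Undefined) on a repeated one.
    int TwoStepStat()
    {
        if (CnlStat != 0) { return CnlStat; }
        return (Stat() == 5 || Stat() == 0) ? 0 : 5;
    }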

    #8048
    Mikhail
    Moderator

    Hi,

    > But it raises a Normal event when the status changes from “Unreliable” to “Normal”. Is there any way not to trigger the Normal event?

    Let’s test the approach:
    1. Switch off the channel formula.
    2. Post a link to a screenshot of the channel properties.
    3. Using Generator, send data several times: normal status, unreliable status, undefined status, normal status.
    4. Check which events were created. Send a screenshot of the events.

    If the above works as expected, I will help with the formula.

    #8059
    JW
    Participant

    Hi Mikhail,

    I did the 3 tests listed below.

    Currently, when a lower/upper alarm limit is set, no matter what the previous status was (including the Unreliable and Undefined statuses), as soon as the channel goes into the Normal status it triggers a Normal event.

    I thought there were 2 separate sets of alarm logic, but I don’t know which one Unreliable falls into.
    - Event on limit: status 11/12/13/14/15
    - Event on undefined: 0/1 / (5?)

    But it seems they are actually somewhat related.

    1.
    Write Event + Event on Undefined + Lower + Upper Alarm Limit
    normal status, unreliable status, undefined status, normal status
    there are:
    – Undefined event when changed from unreliable to undefined;
    – Normal event when changed from undefined to normal
    https://drive.google.com/file/d/1DaZV7JnzTr0qfDgxGLfRBrs5YH25rmlz/view?usp=sharing

    2.
    Write Event + Event on Undefined + Lower + Upper Alarm Limit
    normal status, unreliable status, normal status
    there is:
    – Normal event when changed from unreliable to normal
    https://drive.google.com/file/d/1kYIYr07C0QjAZkhsf3iOQbPMWJaSlyD5/view?usp=sharing

    3.
    Write Event + Event on Undefined
    normal status, unreliable status, normal status
    there is:
    – No event
    https://drive.google.com/file/d/1pm_0ebgHjRSWiU_L25jLD2YnuQCaECiz/view?usp=sharing

    #8066
    Mikhail
    Moderator

    Hi,

    The 3rd screenshot is not shared.
    I forgot about the behavior with limits. What we can do is use a collection to store an extra status instead of the Unreliable status:

    public Dictionary<int, bool> ChannelErrors = new Dictionary<int, bool>();

    #8071
    JW
    Participant

    Thanks Mikhail, the problem is solved. The formula below keeps the last result until n consecutive errors have occurred.

    // Number of consecutive undefined readings per channel number.
    public Dictionary<int, int> ChannelErrors = new Dictionary<int, int>();

    // Value formula: hold the previous value while fewer than n consecutive
    // errors have been counted; after that report NaN.
    double NStepVal(int n)
    {
        if (!ChannelErrors.ContainsKey(CnlNum)) { ChannelErrors[CnlNum] = 0; }
        if (CnlStat != 0) { return CnlVal; }
        return ChannelErrors[CnlNum] >= n ? double.NaN : Val();
    }

    // Status formula: count consecutive undefined readings.
    int NStepStat(int n)
    {
        if (!ChannelErrors.ContainsKey(CnlNum)) { ChannelErrors[CnlNum] = 0; }
        if (CnlStat != 0)
        {
            ChannelErrors[CnlNum] = 0; // a good reading clears the counter
            return CnlStat;
        }
        // The counter stays at n, so the channel remains undefined
        // until a good reading arrives.
        if (ChannelErrors[CnlNum] >= n) { return 0; }
        ChannelErrors[CnlNum]++;
        return 1;
    }
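
The effect of holding a value through n consecutive errors can be tried outside the SCADA runtime. The following Python sketch simulates the idea on a list of polling results; it is an illustration under simplified assumptions (status 1 = defined, 0 = undefined, and nothing is held before the first good sample), not a port of the formula API. The name `n_step_filter` is made up.

```python
def n_step_filter(samples, n):
    """Hold the last good value through up to n consecutive failed polls.

    samples: iterable of (value, ok) pairs, one per polling cycle.
    Yields (value, status) pairs with status 1 = defined, 0 = undefined.
    """
    errors = 0
    last_good = None
    for value, ok in samples:
        if ok:
            errors = 0  # a good sample clears the error counter
            last_good = value
            yield value, 1
        elif errors < n and last_good is not None:
            errors += 1  # tolerate this failure and repeat the held value
            yield last_good, 1
        else:
            yield float("nan"), 0  # too many failures in a row


polls = [(10, True), (0, False), (0, False), (0, False), (11, True)]
for value, status in n_step_filter(polls, n=2):
    print(value, status)
```

With n = 2 the first two failed polls repeat the value 10 with status 1, the third failed poll yields an undefined result, and the final good poll restores the channel, so a single lost cycle never reaches the event logic.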