Modbus Comm Bug
- This topic has 20 replies, 2 voices, and was last updated 2 years, 10 months ago by Mikhail.
January 11, 2021 at 4:36 pm #7971 JW (Participant)
Hi,
I recently noticed some abnormal behavior in the Modbus communication.
I have 6 communication lines.
4 of the lines are connected to my Modbus slave software at 127.0.0.1.
Each of them has about 30 element groups, with 50 signals per group.

Line parameters:
main parameters | delay after request cycle: 3000
request sequence | timeout: 1000; delay: 0
https://drive.google.com/file/d/1dwiAdm80DvyIxSlUo5rAR29D0dhfSj5P/view?usp=sharing

SCADA Comm detects a communication error every few hours, in the middle of a request cycle. The readings of the first half of the element groups are normal, but the second half of the element groups become undefined for a second. In the next cycle, they return to normal.
https://drive.google.com/file/d/1LEwDlS-gVvI7ufeOx2fJr2tep1MV89Bf/view?usp=sharing

The Modbus slave software reports that the connection was closed by the remote host (SCADA Comm):
-----
Error while processing data on socket 724: [WinError 10054] An existing connection was forcibly closed by the remote host
-----
It looks like SCADA misaligns the replies from the Modbus slave software.
https://drive.google.com/file/d/11y8e6zyzzKWXJovF0uj8TM22hCG9TJkH/view?usp=sharing

When I set the delay to 20 or 50 ms, it becomes even worse: it can never finish a complete request cycle of 50 element groups.
I wrote a script to stress test my Modbus slave software but didn't find any problem. I will do more tests later. (At the moment, it doesn't seem to be a problem with the Modbus slave.)
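
One way to check the misalignment hypothesis is to compare the transaction ID in the MBAP header of each reply with the one sent in the request. The sketch below is a simulation under assumptions, not SCADA Comm's actual code: it models the receive buffer of one TCP connection as a FIFO of reply frames and assumes that a single reply arrives after the client's timeout has expired. The helper names (`build_request`, `build_reply`, `parse_mbap`, `poll`) are hypothetical.

```python
import struct
from collections import deque


def build_request(tid, unit, start, count):
    """Build a Modbus TCP 'read holding registers' (function 0x03) request."""
    pdu = struct.pack(">BHH", 0x03, start, count)
    # MBAP header: transaction ID, protocol ID (always 0),
    # length (unit ID + PDU bytes), unit ID
    return struct.pack(">HHHB", tid, 0, len(pdu) + 1, unit) + pdu


def build_reply(tid, unit, values):
    """Build the matching reply frame carrying the given register values."""
    pdu = struct.pack(">BB", 0x03, 2 * len(values))
    pdu += struct.pack(f">{len(values)}H", *values)
    return struct.pack(">HHHB", tid, 0, len(pdu) + 1, unit) + pdu


def parse_mbap(frame):
    """Return (transaction ID, protocol ID, length, unit ID) of a frame."""
    return struct.unpack(">HHHB", frame[:7])


def poll(tids, late_tid):
    """Simulate polls on ONE connection; the reply to late_tid misses its timeout.

    Returns a list of (request transaction ID, reply transaction ID or None).
    """
    pending = deque()  # replies waiting in the socket receive buffer
    results = []
    for tid in tids:
        request = build_request(tid, 1, 0, 2)
        pending.append(build_reply(parse_mbap(request)[0], 1, [tid, tid]))
        if tid == late_tid:
            # The reply shows up only after the timeout: the client gives up,
            # but the frame stays in the buffer of the same connection.
            results.append((tid, None))
            continue
        results.append((tid, parse_mbap(pending.popleft())[0]))
    return results


for sent, received in poll([1, 2, 3, 4], late_tid=2):
    print(f"request {sent}: reply {received}")
```

In this model, every reply read after the late one belongs to the previous request, so the header never matches again until the buffer is drained or a new connection is opened.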

January 11, 2021 at 4:55 pm #7972 JW (Participant)
I just checked the SCADA Comm logs of other projects, all of which have similar settings.
Most projects detect a communication line disconnection every few days.
I disabled the detailed log, so it only shows the disconnection but not the cause.
I believe they may have the same cause.
-----
2021-01-04 06:36:05 Establish a TCP connection with 127.0.0.1:1502
2021-01-04 10:55:21 Disconnect from 127.0.0.1
2021-01-04 10:55:22 Establish a TCP connection with 127.0.0.1:1502
2021-01-04 23:00:03 Disconnect from 127.0.0.1
2021-01-04 23:00:04 Establish a TCP connection with 127.0.0.1:1502
2021-01-06 12:28:49 Disconnect from 127.0.0.1
2021-01-06 12:28:50 Establish a TCP connection with 127.0.0.1:1502
2021-01-10 12:00:03 Disconnect from 127.0.0.1
2021-01-10 12:00:04 Establish a TCP connection with 127.0.0.1:1502
2021-01-11 02:57:53 Disconnect from 127.0.0.1
2021-01-11 02:57:54 Establish a TCP connection with 127.0.0.1:1502
-----

January 12, 2021 at 8:30 am #7980 JW (Participant)
The frequency of occurrence seems pretty random.
In my test yesterday, it happened a few times per hour, but when I tried to repeat it today with the same PC and the same project, it didn't happen at all.

January 12, 2021 at 9:05 am #7981 JW (Participant)
I am thinking about the following parameter:
number of request retries on error: 3
Currently, SCADA Comm retries within the same TCP connection.
When the reply to a request is misaligned, retrying seems meaningless, because the header will never match again. After 3 attempts, SCADA Comm sets the element groups to undefined and then starts a new TCP session. Communication returns to normal on the new TCP connection, but a false alarm has already been triggered.
Thus, this setting doesn't increase the tolerance for comm errors in this case.

Is it possible to have another option to retry on a new TCP connection, or to count errors per failed TCP connection rather than per request?
Then either continue from the previously failed element group or start all over again.
When the comm fails on 3 connections, set the failed element groups to undefined.
This should give more tolerance for comm errors.

StayConnected: True / False
By default, it's set to true.
When set to false, SCADA Comm uses a new TCP session for each request cycle.
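
The retry-on-a-new-connection idea proposed above can be sketched as follows. This is only an illustration of the requested behavior, not how SCADA Comm works: `ReconnectingClient`, `CommError`, `FakeConnection` and `make_factory` are hypothetical names, and a connection is assumed to be any object with `exchange()` and `close()` methods.

```python
class CommError(Exception):
    """Raised when a reply is missing or does not match the request."""


class ReconnectingClient:
    """Retries a failed request on a fresh connection instead of the old one."""

    def __init__(self, connect, max_connections=3):
        self._connect = connect  # factory that opens a new connection
        self._max_connections = max_connections
        self._conn = None

    def request(self, payload):
        """Return the reply, or None after max_connections failed connections."""
        for _ in range(self._max_connections):
            if self._conn is None:
                self._conn = self._connect()
            try:
                return self._conn.exchange(payload)
            except CommError:
                # Drop the connection so stale data cannot affect the retry.
                self._conn.close()
                self._conn = None
        return None  # only now would the element group become undefined


class FakeConnection:
    """Test double: every exchange fails if 'broken' is True."""

    def __init__(self, broken):
        self.broken = broken
        self.closed = False

    def exchange(self, payload):
        if self.broken:
            raise CommError("reply does not match request")
        return b"reply:" + payload

    def close(self):
        self.closed = True


def make_factory(outcomes):
    """Return (connect, opened); each connect() call consumes one outcome."""
    remaining = list(outcomes)
    opened = []

    def connect():
        conn = FakeConnection(remaining.pop(0))
        opened.append(conn)
        return conn

    return connect, opened


# First connection is broken, second one works.
connect, opened = make_factory([True, False])
client = ReconnectingClient(connect)
reply = client.request(b"group-17")
print(reply, len(opened))
```

With this approach, a single misaligned connection costs one reconnect, and the group is marked undefined only if several consecutive connections fail.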
But I am not able to test whether StayConnected changes the retry handling. Will it retry on the same TCP connection or on a new TCP connection? If SCADA Comm retries on a new TCP connection, it should solve my problem, but I guess this parameter does not affect the retry handling.

January 12, 2021 at 2:35 pm #7994 Mikhail (Moderator)
Hi,
To be honest, I think that your network connection is quite stable. It disconnects only once every few days. It's normal for a connection to break and reconnect sometimes.
If you set StayConnected=true, it should work more reliably.
I suppose that the disconnections are caused by random network errors.

January 12, 2021 at 2:37 pm #7995 Mikhail (Moderator)
> When I set the delay to 20 or 50 ms, it becomes even worse: it can never finish a complete request cycle of 50 element groups.
This is strange. The delay just does Thread.Sleep(50) after receiving an answer from a device.

January 12, 2021 at 2:39 pm #7996 Mikhail (Moderator)
Have you thought about splitting the template? It would make a polling session shorter. A short template is better in case of communication errors, because there is no need to request 49 groups again if the 50th group fails.
- This reply was modified 3 years, 9 months ago by Mikhail.

January 12, 2021 at 4:30 pm #7998 JW (Participant)
> When I set the delay to 20 or 50 ms, it becomes even worse: it can never finish a complete request cycle of 50 element groups.
> This is strange. The delay just does Thread.Sleep(50) after receiving an answer from a device.
I tried this setting again, and it works fine today. I can't reproduce the problem, so it may not be the cause.

January 12, 2021 at 4:44 pm #7999 JW (Participant)
> Have you thought about splitting the template? It would make a polling session shorter. A short template is better in case of communication errors, because there is no need to request 49 groups again if the 50th group fails.
Requesting 30 groups (50 signals per group) works fine most of the time and takes less than 0.1 s, since the communication is from localhost (SCADA) to localhost (another software on the same server).
The main issue I am facing is that if I enable Event on Undefined for those sensors, there will be events/alarms for an undefined channel (disconnected sensor), followed by a return-to-normal event on the next request.

January 13, 2021 at 2:02 pm #8017 Mikhail (Moderator)
> Requesting 30 groups (50 signals per group) works fine most of the time and takes less than 0.1 s
It's fast!
> if I enable Event on Undefined for those sensors, there will be events/alarms for an undefined channel
A possible solution is to develop a formula that changes the channel status in 2 steps: Unreliable, then Undefined. If the connection is restored on the next request, events will not appear.

January 15, 2021 at 5:26 pm #8035 JW (Participant)
Hi Mikhail,
I made a formula using your approach, which reduced the chance of a false Undefined event.
But it produces a Normal event when the status goes from “Unreliable” to “Normal”. Is there any way not to trigger the Normal event?
One way I can think of is to
1. set one channel n1 using the 2-step value and status
2. set another channel n2 using Val(n1)
but this is not convenient if I have lots of sensors.

double TwoStepVal()
{
    if (CnlStat == 0)
    {
        if (Stat() == 5 || Stat() == 0) { return double.NaN; }
        else { return Val(); }
    }
    else { return CnlVal; }
}

int TwoStepStat()
{
    if (CnlStat == 0)
    {
        if (Stat() == 5 || Stat() == 0) { return 0; }
        else { return 5; }
    }
    else { return CnlStat; }
}

January 17, 2021 at 8:39 am #8048 Mikhail (Moderator)
Hi,
> But it produces a Normal event when the status goes from “Unreliable” to “Normal”. Is there any way not to trigger the Normal event?
Let's test the approach:
1. Switch off the channel formula.
2. Post a link to a screenshot of the channel properties.
3. Using Generator, send data several times: normal status, unreliable status, undefined status, normal status.
4. Check the events that were created. Send a screenshot of the events.

If the above works as expected, I will help with the formula.

January 18, 2021 at 8:01 am #8059 JW (Participant)
Hi Mikhail,
I did the 3 tests listed below.
Currently, when a lower/upper alarm limit is set, no matter what the previous status was (including the unreliable and undefined states), as soon as the channel goes into the normal status, it triggers a Normal event.
I thought there were 2 separate sets of alarm logic, but I don't know which one unreliable falls into:
– Event on limit: stat 11/12/13/14/15
– Event on undefined: 0/1 / (5?)
But it seems they are actually somewhat related.

1.
Write Event + Event on Undefined + Lower + Upper Alarm Limit
normal status, unreliable status, undefined status, normal status
There are:
– Undefined event when changed from unreliable to undefined;
– Normal event when changed from undefined to normal
https://drive.google.com/file/d/1DaZV7JnzTr0qfDgxGLfRBrs5YH25rmlz/view?usp=sharing

2.
Write Event + Event on Undefined + Lower + Upper Alarm Limit
normal status, unreliable status, normal status
There is:
– Normal event when changed from unreliable to normal
https://drive.google.com/file/d/1kYIYr07C0QjAZkhsf3iOQbPMWJaSlyD5/view?usp=sharing

3.
Write Event + Event on Undefined
normal status, unreliable status, normal status
There is:
– No event
https://drive.google.com/file/d/1pm_0ebgHjRSWiU_L25jLD2YnuQCaECiz/view?usp=sharing

January 18, 2021 at 9:52 am #8066 Mikhail (Moderator)
Hi,
The 3rd screenshot is not shared.
I forgot about the behavior with limits. What we can do is use a collection to store an extra status instead of the Unreliable status:

public Dictionary<int, bool> ChannelErrors = new Dictionary<int, bool>();

January 18, 2021 at 2:48 pm #8071 JW (Participant)
Thanks Mikhail, the problem is solved. The formula below keeps the previous result until n consecutive errors have occurred.

// Number of consecutive errors counted so far, per channel number
public Dictionary<int, int> ChannelErrors = new Dictionary<int, int>();

// Value: keep the previous value until the error counter reaches n
double NStepVal(int n)
{
    if (!ChannelErrors.ContainsKey(CnlNum)) { ChannelErrors[CnlNum] = 0; }
    if (CnlStat == 0)
    {
        if (ChannelErrors[CnlNum] >= n) { return double.NaN; }
        else { return Val(); }
    }
    else { return CnlVal; }
}

// Status: count errors; report 0 once the counter has reached n, then reset it
int NStepStat(int n)
{
    if (!ChannelErrors.ContainsKey(CnlNum)) { ChannelErrors[CnlNum] = 0; }
    if (CnlStat == 0)
    {
        if (ChannelErrors[CnlNum] >= n) { ChannelErrors[CnlNum] = 0; return 0; }
        else { ChannelErrors[CnlNum] = ChannelErrors[CnlNum] + 1; return 1; }
    }
    else { ChannelErrors[CnlNum] = 0; return CnlStat; }
}
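
To see how the error counter behaves over a sequence of polls, the status logic can be simulated outside the SCADA runtime. The Python sketch below is not formula code and covers only the status part; `n_step_status` is a hypothetical name. The `reset_after_undefined` flag is an assumption added for comparison: with `True` it mirrors the formula as posted, where the counter is reset once status 0 has been reported; with `False` the counter is held until a good reading arrives.

```python
def n_step_status(raw_statuses, n, reset_after_undefined=True):
    """Return the status reported for each raw status in the sequence.

    A raw status of 0 means the poll failed; any other status is passed through.
    """
    errors = 0  # consecutive failed polls counted so far
    reported = []
    for raw in raw_statuses:
        if raw == 0:
            if errors >= n:
                # n failures were already tolerated: report undefined
                if reset_after_undefined:
                    errors = 0
                reported.append(0)
            else:
                # still within tolerance: count the failure, report defined
                errors += 1
                reported.append(1)
        else:
            errors = 0
            reported.append(raw)
    return reported


raw = [1, 0, 1, 0, 0, 0, 0, 0, 0, 1]
print(n_step_status(raw, n=2))
# [1, 1, 1, 1, 1, 0, 1, 1, 0, 1]
print(n_step_status(raw, n=2, reset_after_undefined=False))
# [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
```

In this simulation, a single failed poll never produces status 0. During a long outage the first variant returns to status 1 after each reported 0, which may or may not be desired depending on how events are configured.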