Bug #5563
openOsmoMSC sometimes stalls for dozens of seconds in a production deployment
50%
Description
When we take a long-term (8 hours) bpftrace showing us the delay between subsequent calls to poll()
(by libosmocore/src/select.c) in osmo-msc, we get the following histogram (units in milli-seconds):
@poll: [0] 532245 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 13088 |@ | [2, 4) 5621 | | [4, 8) 5566 | | [8, 16) 2746 | | [16, 32) 5282 | | [32, 64) 5262 | | [64, 128) 6139 | | [128, 256) 14273 |@ | [256, 512) 18357 |@ | [512, 1K) 13806 |@ | [1K, 2K) 4222 | | [2K, 4K) 1331 | | [4K, 8K) 450 | | [8K, 16K) 0 | | [16K, 32K) 0 | | [32K, 64K) 5 | | [64K, 128K) 17 | | [128K, 256K) 2 | |So as we can see
- the majority is very low (sub-second to 128ms)
- there is a smaller peak in the order of 128ms to 1s (surprisingly long)
- there are still several thousand of instances where the delay isn the 1s..4s. interval (too long!)
- there ar rare occasions where we don't return to poll for 32, 64, or evne more than 128 seconds (crazy!)
If we contrast this with the amount of time we spent in dbi_conn_queryf
, this is clearly not the culprit:
@dbi_query: [0] 37008 |@ | [1] 1640233 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [2, 4) 1245771 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [4, 8) 21406 | | [8, 16) 325 | | [16, 32) 71 | | [32, 64) 17 | |
So the longest duration DB query was in the order of 32..63 ms. Not good, but not a problem either with all the MSC (MM, CC, SMS, BSSAP, SCCP, ...) time-outs being in the multiple-second range.
So now we have to find out if the stalls are
- due to excessive system load (like I/O) outside of osmo-msc, or
- due to something osmo-msc is doing by itself (like calling thousands of database queries of several milli-seconds each) without going through the libosmocore poll main loop.
Related issues
Updated by laforge about 2 years ago
The result of all of this is long stalling is (of course) that all of the timers time out, up to the point where the BSC detects that the MSC is gone and it starts dropping new subscriber connections due to MSC inavailability:
May 13 23:59:20 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x28282D3B: No suitable MSC for this Complete Layer 3 request found May 13 23:59:20 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x60BE0328: No suitable MSC for this Complete Layer 3 request found May 13 23:59:21 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x611E7F66: No suitable MSC for this Complete Layer 3 request found May 13 23:59:21 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x5D9F98F2: No suitable MSC for this Complete Layer 3 request found May 13 23:59:21 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0xA8A19D59: No suitable MSC for this Complete Layer 3 request found May 13 23:59:21 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x02376F59: No suitable MSC for this Complete Layer 3 request foundThe logic for this detection is in a_reset.c:
- every time a SCCP CR is not responded to for 20s: increment a counter
- every time a SCCP CR is positively acknowledged: set the counter to zero
- if the counter reaches 3: declare the MSC as dead
So if the MSC stalls for anything close to 20s or more, the above condition will be met. Likewise, of course, many timers in other protocols run out. This simply must not happen.
Updated by laforge about 2 years ago
for the record, this is the bpftrace script used to generate the above histograms:
BEGIN { @last_elapsed = nsecs; } tracepoint:syscalls:sys_enter_poll /comm == "osmo-msc"/ { $since_last = (nsecs - @last_elapsed)/1000000; @last_elapsed = nsecs; @poll = hist($since_last); } uprobe:/usr/lib/x86_64-linux-gnu/libdbi.so.1.1.0:dbi_conn_queryf { @query_start = nsecs; } uretprobe:/usr/lib/x86_64-linux-gnu/libdbi.so.1.1.0:dbi_conn_queryf { $query_duration = (nsecs - @query_start)/1000000; @dbi_query = hist($query_duration); } interval:s:10 { printf("=========================================================="); time("%Y-%m-%d %H:%M:%S\n"); print(@poll); print(@dbi_query); }
Updated by laforge about 2 years ago
- Category set to SMS
- Status changed from New to In Progress
The culprit is likely sms_submit_pending()
where we do up to 1000 iterations of a loop that performs a database read operation. So if the database takes 4ms each, that means 4s of stalling - and of course more if the DB operations take longer.
Updated by laforge about 2 years ago
laforge wrote in #note-3:
The culprit is likely
sms_submit_pending()
where we do up to 1000 iterations of a loop that performs a database read operation. So if the database takes 4ms each, that means 4s of stalling - and of course more if the DB operations take longer.
I've now added the following snipped to the bpftrace so we know how long sms_submit_pending takes:
uprobe:/usr/bin/osmo-msc:sms_submit_pending { @sms_submit_pending_start = nsecs; } uretprobe:/usr/bin/osmo-msc:sms_submit_pending { $duration = (nsecs - @sms_submit_pending_start)/1000000; @sms_submit_pending = hist($duration); }
Updated by laforge about 2 years ago
"Unfortunately" it doesn't seem to be sms_submit_pending, at least not from what I can see so far:
==========================================================2022-05-14 06:35:10 @poll: [0] 59773 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 550 | | [2, 4) 96 | | [4, 8) 171 | | [8, 16) 181 | | [16, 32) 239 | | [32, 64) 290 | | [64, 128) 276 | | [128, 256) 895 | | [256, 512) 1956 |@ | [512, 1K) 2010 |@ | [1K, 2K) 952 | | [2K, 4K) 1098 | | [4K, 8K) 1069 | | @dbi_query: [0] 736 |@ | [1] 32393 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [2, 4) 5356 |@@@@@@@@ | [4, 8) 328 | | @sms_submit_pending: [128, 256) 89 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [256, 512) 150 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
So we have [again] had 1096 times where the time between to successive poll() executions was > 4s, but at the same time the maximum duration of sms_submit_pending was in the 256..512ms bucket.
One possible error is that our '@poll' histogram represents just the absolute time between polls, irrespective of whether or not there actually was any timeout that would have constrained that poll. So assuming no osmo_timers are expiring and no network traffic happens, the value can of course go into the seconds range.
I've now added a '@exit_to_poll' histogram, which masures the time between exit of the last poll until we re-renter poll, and therefore it is a good measure for how much time we spend in osmo-msc code (potentially doing blocking file I/O via sqlite3) before again processing the next I/O event via poll.
total bpftrace script running now:
BEGIN { @last_elapsed = nsecs; @last_poll_exit = nsecs; } tracepoint:syscalls:sys_enter_poll /comm == "osmo-msc"/ { /* time since last poll */ $since_last = (nsecs - @last_elapsed)/1000000; @last_elapsed = nsecs; @poll = hist($since_last); /* time since last poll exit (time spent in osmo-msc) */ $since_exit = (nsecs - @last_poll_exit)/1000000; @exit_to_poll = hist($since_exit); } tracepoint:syscalls:sys_exit_poll /comm == "osmo-msc"/ { @last_poll_exit = nsecs; } uprobe:/usr/lib/x86_64-linux-gnu/libdbi.so.1.1.0:dbi_conn_queryf { @query_start = nsecs; } uretprobe:/usr/lib/x86_64-linux-gnu/libdbi.so.1.1.0:dbi_conn_queryf { $query_duration = (nsecs - @query_start)/1000000; @dbi_query = hist($query_duration); } uprobe:/usr/bin/osmo-msc:sms_submit_pending { @sms_submit_pending_start = nsecs; } uretprobe:/usr/bin/osmo-msc:sms_submit_pending { $duration = (nsecs - @sms_submit_pending_start)/1000000; @sms_submit_pending = hist($duration); } interval:s:10 { printf("=========================================================="); time("%Y-%m-%d %H:%M:%S\n"); print(@poll); print(@exit_to_poll); print(@dbi_query); print(@sms_submit_pending); }
Updated by laforge about 2 years ago
Running under low load for jus a few minutes, I can see the following (expected) correlation:
@exit_to_poll: [0] 2936 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 38 | | [2, 4) 28 | | [4, 8) 69 |@ | [8, 16) 0 | | [16, 32) 0 | | [32, 64) 0 | | [64, 128) 0 | | [128, 256) 18 | | [256, 512) 36 | | @sms_submit_pending: [128, 256) 20 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [256, 512) 34 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
The two longet-duration buckets of 'exit_to_poll' rather nicely correspond with the two (existing at all) buckets for sms_submit_pending.
So let's wait until SMS load increases during daytime, how far both of these will increment.
Updated by laforge about 2 years ago
We just witnessed one such event, where sms_submit_pending took more than 64 seconds (!):
Until 07:04:05am local time everything was fine:
==========================================================2022-05-14 07:04:05 @poll: [0] 10822 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 306 |@ | [2, 4) 62 | | [4, 8) 123 | | [8, 16) 65 | | [16, 32) 163 | | [32, 64) 123 | | [64, 128) 132 | | [128, 256) 365 |@ | [256, 512) 479 |@@ | [512, 1K) 439 |@@ | [1K, 2K) 180 | | [2K, 4K) 44 | | [4K, 8K) 8 | | @exit_to_poll: [0] 12540 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 183 | | [2, 4) 85 | | [4, 8) 287 |@ | [8, 16) 3 | | [16, 32) 1 | | [32, 64) 0 | | [64, 128) 0 | | [128, 256) 36 | | [256, 512) 176 | | @dbi_query: [0] 674 | | [1] 57332 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [2, 4) 11394 |@@@@@@@@@@ | [4, 8) 299 | | [8, 16) 0 | | [16, 32) 1 | | @sms_submit_pending: [128, 256) 39 |@@@@@@@@@@@ | [256, 512) 173 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
But then, 10s later, we saw this:
==========================================================2022-05-14 07:04:15 @poll: [0] 10906 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 307 |@ | [2, 4) 68 | | [4, 8) 123 | | [8, 16) 65 | | [16, 32) 163 | | [32, 64) 123 | | [64, 128) 134 | | [128, 256) 366 |@ | [256, 512) 479 |@@ | [512, 1K) 439 |@@ | [1K, 2K) 180 | | [2K, 4K) 44 | | [4K, 8K) 8 | | [8K, 16K) 0 | | [16K, 32K) 0 | | [32K, 64K) 0 | | [64K, 128K) 1 | | @exit_to_poll: [0] 12626 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 183 | | [2, 4) 91 | | [4, 8) 289 |@ | [8, 16) 3 | | [16, 32) 1 | | [32, 64) 0 | | [64, 128) 0 | | [128, 256) 36 | | [256, 512) 176 | | [512, 1K) 0 | | [1K, 2K) 0 | | [2K, 4K) 0 | | [4K, 8K) 0 | | [8K, 16K) 0 | | [16K, 32K) 0 | | [32K, 64K) 0 | | [64K, 128K) 1 | | @dbi_query: [0] 680 | | [1] 62473 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [2, 4) 11517 |@@@@@@@@@ | [4, 8) 299 | | [8, 16) 0 | | [16, 32) 1 | | @sms_submit_pending: [128, 256) 39 |@@@@@@@@@@@ | [256, 512) 173 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [512, 1K) 0 | | [1K, 2K) 0 | | [2K, 4K) 0 | | [4K, 8K) 0 | | [8K, 16K) 0 | | [16K, 32K) 0 | | [32K, 64K) 0 | | [64K, 128K) 1 | |
So sms_submit_pending took 64..127s and that of course also now shows up in exit_to_poll and poll. So I think we've established evidences that sms_submit_pending is at least one culprit.
I guess the best option would be to not process 1000 database queries at onec but simply only do 1..10, keep the statement / cursor and then continue in the next iteration.
Updated by laforge about 2 years ago
logs around that time:
May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:575 SMC(0) message MMSMS-EST-CNF received in state MM_CONN_PENDING May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:189 SMC(0) send CP data May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:142 SMC(0) new CP state MM_CONN_PENDING -> WAIT_CP_ACK May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:96 SMC(0) instance created for network May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smr.c:92 SMR(0) instance created for network. May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm_04_11.c:1134 trans(SMS IMSI-334020504656624:MSISDN-69610126530:TMSI-0x1DC26424 callref-0x400004a3 tid-0) Going to send a MT SMS May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smr.c:422 SMR(0) message SM-RL-DATA_REQ received in state IDLE May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smr.c:146 SMR(0) new RP state IDLE -> WAIT_FOR_RP_ACK May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:475 SMC(0) message MNSMS-EST-REQ received in state IDLE May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:142 SMC(0) new CP state IDLE -> MM_CONN_PENDING May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:96 SMC(0) instance created for network May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smr.c:92 SMR(0) instance created for network. May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm_04_11.c:1134 trans(SMS IMSI-334020333482858:MSISDN-69610145942:TMSI-0x21200B95 callref-0x400004a4 tid-0) Going to send a MT SMS May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm_04_11.c:408 trans(SMS IMSI-334020333482858:MSISDN-69610145942:TMSI-0x21200B95 callref-0x400004a4 tid-0) sending status report for SMS reference 45 May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smr.c:422 SMR(0) message SM-RL-DATA_REQ received in state IDLE May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smr.c:146 SMR(0) new RP state IDLE -> WAIT_FOR_RP_ACK May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:475 SMC(0) message MNSMS-EST-REQ received in state IDLE May 14 07:03:00 huautla-bsc osmo-msc[29081]: DLSMS INFO gsm0411_smc.c:142 SMC(0) new CP state IDLE -> MM_CONN_PENDING May 14 07:03:01 huautla-bsc osmo-bsc[5253]: DCHAN NOTICE abis_rsl.c:2947 lchan(1-0-4-TCH_H-1)[0x5582ef9a05f0]{WAIT_RF_RELEASE_ACK}: (type=TCH_H) Rx IPACC DLCX IND (cause=Normal event, unspecified [ 0f ]) May 14 07:03:01 huautla-bsc osmo-bsc[5253]: DCHAN NOTICE abis_rsl.c:2947 lchan(1-0-3-TCH_H-0)[0x5582ef9de120]{WAIT_RF_RELEASE_ACK}: (type=TCH_H) Rx IPACC DLCX IND (cause=Normal event, unspecified [ 0f ]) May 14 07:03:01 huautla-bsc CRON[14949]: pam_unix(cron:session): session opened for user root by (uid=0) May 14 07:03:01 huautla-bsc CRON[14950]: (root) CMD (/var/rhizomatica/bin/network_update60.sh) May 14 07:03:01 huautla-bsc CRON[14949]: pam_unix(cron:session): session closed for user root May 14 07:03:02 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 14 07:03:14 huautla-bsc osmo-bsc[5253]: DMSC ERROR gsm_08_08.c:527 SUBSCR_CONN(msc0-conn32891_subscr-IMSI-334020240287401-TMSI-0x0323b238)[0x5582efb68ff0]{WAIT_CLEAR_CMD}: Event MO_COMPL_L3 not permitted May 14 07:03:14 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: Operation not permitted. May 14 07:03:27 huautla-bsc osmo-bsc[5253]: DMSC ERROR gsm_08_08.c:527 SUBSCR_CONN(msc0-conn32891_subscr-IMSI-334020240287401-TMSI-0x0323b238)[0x5582efb68ff0]{WAIT_CLEAR_CMD}: Event MO_COMPL_L3 not permitted May 14 07:03:27 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: Operation not permitted. May 14 07:03:32 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 14 07:03:37 huautla-bsc dhclient[546]: DHCPREQUEST for 192.168.1.107 on eth0 to 192.168.1.1 port 67 May 14 07:03:37 huautla-bsc dhclient[546]: DHCPACK of 192.168.1.107 from 192.168.1.1 May 14 07:03:37 huautla-bsc dnsmasq[1076]: reading /etc/resolv.conf May 14 07:03:37 huautla-bsc dnsmasq[825]: reading /etc/resolv.conf May 14 07:03:37 huautla-bsc dnsmasq[825]: using nameserver 192.168.1.1#53 May 14 07:03:37 huautla-bsc dnsmasq[1076]: using nameserver 192.168.1.1#53 May 14 07:03:37 huautla-bsc dhclient[546]: bound to 192.168.1.107 -- renewal in 220 seconds. May 14 07:03:46 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 14 07:03:46 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[other]: no resources for SDCCH 0x13, retrying with TCH_H May 14 07:03:47 huautla-bsc osmo-bsc[5253]: DMSC ERROR gsm_08_08.c:555 SUBSCR_CONN(msc0-conn32899_subscr-TMSI-0x110b4690)[0x5582efa57de0]{WAIT_CC}: Event MO_DTAP not permitted May 14 07:03:48 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 14 07:03:48 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[other]: no resources for SDCCH 0x16, retrying with TCH_H May 14 07:03:49 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 14 07:03:49 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[Location updating]: no resources for SDCCH 0x6, retrying with TCH_H May 14 07:03:51 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 14 07:03:51 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[Location updating]: no resources for SDCCH 0xf, retrying with TCH_H May 14 07:03:53 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 14 07:03:53 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[Location updating]: no resources for SDCCH 0x7, retrying with TCH_H May 14 07:03:55 huautla-bsc osmo-bsc[5253]: DRESET NOTICE bssmap_reset.c:91 bssmap_reset(msc-0)[0x5582ef9cdce0]{DISC}: link lost May 14 07:03:55 huautla-bsc osmo-bsc[5253]: DMSC NOTICE a_reset.c:54 (msc0) BSSMAP assocation is down May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/15@mgw DLCX: deleting connection(s) ... May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/15@mgw DLCX: connection successfully deleted May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/15@mgw DLCX: deleting connection(s) ... May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/15@mgw DLCX: connection successfully deleted May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/9@mgw DLCX: deleting connection(s) ... May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/9@mgw DLCX: connection successfully deleted May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/9@mgw DLCX: deleting connection(s) ... May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/9@mgw DLCX: connection successfully deleted May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/7@mgw DLCX: deleting connection(s) ... May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/7@mgw DLCX: connection successfully deleted May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/7@mgw DLCX: deleting connection(s) ... May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/7@mgw DLCX: connection successfully deleted May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/f@mgw DLCX: deleting connection(s) ... May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/f@mgw DLCX: connection successfully deleted May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/f@mgw DLCX: deleting connection(s) ... May 14 07:03:55 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/f@mgw DLCX: connection successfully deleted May 14 07:03:56 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 14 07:03:56 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[Location updating]: no resources for SDCCH 0xb, retrying with TCH_H May 14 07:03:56 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x9B0C8433: No suitable MSC for this Complete Layer 3 request found May 14 07:03:56 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x9b0c8433)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:03:56 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:03:56 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x9b0c8433)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:03:56 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:03:57 huautla-bsc osmo-bsc[5253]: DCHAN NOTICE abis_rsl.c:2947 lchan(1-0-5-TCH_H-1)[0x5582ef9dbff0]{WAIT_RF_RELEASE_ACK}: (type=TCH_H) Rx IPACC DLCX IND (cause=Normal event, unspecified [ 0f ]) May 14 07:03:57 huautla-bsc osmo-bsc[5253]: DCHAN NOTICE abis_rsl.c:2947 lchan(1-0-2-TCH_H-1)[0x5582ef9de8c0]{WAIT_RF_RELEASE_ACK}: (type=TCH_H) Rx IPACC DLCX IND (cause=Normal event, unspecified [ 0f ]) May 14 07:03:57 huautla-bsc osmo-bsc[5253]: DCHAN NOTICE abis_rsl.c:2947 lchan(1-0-2-TCH_H-0)[0x5582ef9e4710]{WAIT_RF_RELEASE_ACK}: (type=TCH_H) Rx IPACC DLCX IND (cause=Normal event, unspecified [ 0f ]) May 14 07:03:57 huautla-bsc osmo-bsc[5253]: DCHAN NOTICE abis_rsl.c:2947 lchan(1-0-4-TCH_H-0)[0x5582ef9cf050]{WAIT_RF_RELEASE_ACK}: (type=TCH_H) Rx IPACC DLCX IND (cause=Normal event, unspecified [ 0f ]) May 14 07:03:58 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x0476645D: No suitable MSC for this Complete Layer 3 request found May 14 07:03:58 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x0476645d)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:03:58 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:03:58 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x0476645d)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:03:58 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x9472613F: No suitable MSC for this Complete Layer 3 request found May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x9472613f)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x9472613f)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x30BCC50E: No suitable MSC for this Complete Layer 3 request found May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x30bcc50e)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x30bcc50e)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:03:59 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:01 huautla-bsc CRON[15018]: pam_unix(cron:session): session opened for user root by (uid=0) May 14 07:04:01 huautla-bsc CRON[15019]: (root) CMD (/var/rhizomatica/bin/network_update60.sh) May 14 07:04:01 huautla-bsc CRON[15018]: pam_unix(cron:session): session closed for user root May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DLSCCP ERROR sccp_scoc.c:1031 SCCP-SCOC(32895)[0x5582efa25280]{WAIT_CONN_CONF}: transition to state IDLE not permitted! May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x0476645D: No suitable MSC for this Complete Layer 3 request found May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x0476645d)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x0476645d)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:02 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DCHAN ERROR lchan_fsm.c:1801 lchan(1-0-1-SDCCH8-1)[0x5582ef9cfcd0]{WAIT_RLL_RTP_ESTABLISH}: (type=SDCCH) lchan allocation failed in state WAIT_RLL_RTP_ESTABLISH: Timeou May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x912B8764: No suitable MSC for this Complete Layer 3 request found May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x912b8764)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x912b8764)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:02 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:04 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x3B133113: No suitable MSC for this Complete Layer 3 request found May 14 07:04:04 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x3b133113)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:04 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:04 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x3b133113)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:04 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x0E00702C: No suitable MSC for this Complete Layer 3 request found May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x0e00702c)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x0e00702c)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x2205655E: No suitable MSC for this Complete Layer 3 request found May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x2205655e)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x2205655e)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x9472613F: No suitable MSC for this Complete Layer 3 request found May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x9472613f)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x9472613f)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x912B8764: No suitable MSC for this Complete Layer 3 request found May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x912b8764)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x912b8764)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:05 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:06 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x30BCC50E: No suitable MSC for this Complete Layer 3 request found May 14 07:04:06 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x30bcc50e)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:06 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:06 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x30bcc50e)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:06 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x1F782AE6: No suitable MSC for this Complete Layer 3 request found May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x1f782ae6)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x1f782ae6)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x0476645D: No suitable MSC for this Complete Layer 3 request found May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x0476645d)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x0476645d)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:07 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:08 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 14 07:04:08 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[other]: no resources for SDCCH 0x13, retrying with TCH_H May 14 07:04:08 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x3CAD94A2: No suitable MSC for this Complete Layer 3 request found May 14 07:04:08 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x3cad94a2)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:08 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:08 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x3cad94a2)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:08 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:10 huautla-bsc osmo-bsc[5253]: DLSCCP ERROR sccp_scoc.c:1031 SCCP-SCOC(32896)[0x5582efae7f40]{WAIT_CONN_CONF}: transition to state IDLE not permitted! May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x1CC2E34F: No suitable MSC for this Complete Layer 3 request found May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x1cc2e34f)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x1cc2e34f)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x7C863499: No suitable MSC for this Complete Layer 3 request found May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x7c863499)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x7c863499)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x18F30D88: No suitable MSC for this Complete Layer 3 request found May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x18f30d88)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x18f30d88)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: IMSI-334020534773701: No suitable MSC for this Complete Layer 3 request found May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-IMSI-334020534773701)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Cle May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-IMSI-334020534773701)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP messa May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_LOC_UPD_REQUEST: TMSI-0x4E610EFD: No suitable MSC for this Complete Layer 3 request found May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x4e610efd)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x4e610efd)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x9472613F: No suitable MSC for this Complete Layer 3 request found May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x9472613f)[0x5582efb32560]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x9472613f)[0x5582efb32560]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:11 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:12 huautla-bsc osmo-bsc[5253]: DRSL ERROR gsm_08_08.c:473 MM GSM48_MT_MM_CM_SERV_REQ: TMSI-0x3B133113: No suitable MSC for this Complete Layer 3 request found May 14 07:04:12 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:171 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x3b133113)[0x5582efb63470]{WAIT_CLEAR_CMD}: Unable to deliver BSSMAP Clear Re May 14 07:04:12 huautla-bsc osmo-bsc[5253]: DMSC ERROR osmo_bsc_sigtran.c:366 MSC is not connected. Dropping. May 14 07:04:12 huautla-bsc osmo-bsc[5253]: DMSC ERROR bsc_subscr_conn_fsm.c:143 SUBSCR_CONN(msc4294967295-conn4294967295_subscr-TMSI-0x3b133113)[0x5582efb63470]{WAIT_SCCP_RLSD}: Unable to deliver SCCP message May 14 07:04:12 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: No such file or directory. May 14 07:04:13 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 14 07:04:13 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[Location updating]: no resources for SDCCH 0x2, retrying with TCH_H
Updated by laforge about 2 years ago
So
May 14 07:03:55 huautla-bsc osmo-bsc[5253]: DRESET NOTICE bssmap_reset.c:91 bssmap_reset(msc-0)[0x5582ef9cdce0]{DISC}: link lost May 14 07:03:55 huautla-bsc osmo-bsc[5253]: DMSC NOTICE a_reset.c:54 (msc0) BSSMAP assocation is down
means that at 07:03:55 we had more than three consecutive SCCP CR from BSC->MSC that received no response within 20s. This means the actual lock-up must have occurred at least 20s earlier, 07:03:35. That's just after a cron job was executed and exactly at the time a DHCP renewal happened. coincidence?
Updated by laforge about 2 years ago
I can find similar time coincidences earlier in the log
May 13 23:56:20 huautla-bsc dhclient[546]: DHCPREQUEST for 192.168.1.107 on eth0 to 192.168.1.1 port 67 May 13 23:56:20 huautla-bsc dhclient[546]: DHCPACK of 192.168.1.107 from 192.168.1.1 May 13 23:56:20 huautla-bsc dnsmasq[1076]: reading /etc/resolv.conf May 13 23:56:20 huautla-bsc dnsmasq[825]: reading /etc/resolv.conf May 13 23:56:20 huautla-bsc dnsmasq[1076]: using nameserver 192.168.1.1#53 May 13 23:56:20 huautla-bsc dnsmasq[825]: using nameserver 192.168.1.1#53 May 13 23:56:20 huautla-bsc dhclient[546]: bound to 192.168.1.107 -- renewal in 266 seconds. May 13 23:56:21 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 13 23:56:51 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 13 23:57:01 huautla-bsc CRON[15156]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:57:01 huautla-bsc CRON[15157]: (root) CMD (/var/rhizomatica/bin/network_update60.sh) May 13 23:57:02 huautla-bsc CRON[15156]: pam_unix(cron:session): session closed for user root May 13 23:57:21 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 13 23:57:51 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 13 23:58:01 huautla-bsc CRON[15207]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:58:01 huautla-bsc CRON[15208]: (root) CMD (/var/rhizomatica/bin/network_update60.sh) May 13 23:58:01 huautla-bsc CRON[15207]: pam_unix(cron:session): session closed for user root May 13 23:58:02 huautla-bsc osmo-bsc[5253]: DRESET NOTICE bssmap_reset.c:91 bssmap_reset(msc-0)[0x5582ef9cdce0]{DISC}: link lost
May 13 23:34:46 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 13 23:35:01 huautla-bsc CRON[12216]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:35:01 huautla-bsc CRON[12217]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:35:01 huautla-bsc CRON[12220]: (root) CMD (/var/rhizomatica/bin/network_update60.sh) May 13 23:35:01 huautla-bsc CRON[12215]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:35:01 huautla-bsc CRON[12219]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:35:01 huautla-bsc CRON[12214]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:35:01 huautla-bsc CRON[12222]: (root) CMD (/var/rhizomatica/bin/platform_update_rrd.sh) May 13 23:35:01 huautla-bsc CRON[12218]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:35:01 huautla-bsc CRON[12221]: (root) CMD (/home/rhizomatica/bin/monitor_rapi.sh > /dev/null 2>&1) May 13 23:35:01 huautla-bsc CRON[12223]: (root) CMD (/home/rhizomatica/bin/monitor_bts.sh > /dev/null 2>&1) May 13 23:35:01 huautla-bsc CRON[12224]: (root) CMD (/var/rhizomatica/bin/network_update_rrd.sh) May 13 23:35:01 huautla-bsc CRON[12225]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1) May 13 23:35:01 huautla-bsc CRON[12214]: pam_unix(cron:session): session closed for user root May 13 23:35:01 huautla-bsc CRON[12215]: pam_unix(cron:session): session closed for user root May 13 23:35:01 huautla-bsc CRON[12217]: pam_unix(cron:session): session closed for user root May 13 23:35:02 huautla-bsc CRON[12216]: pam_unix(cron:session): session closed for user root May 13 23:35:06 huautla-bsc CRON[12218]: pam_unix(cron:session): session closed for user root May 13 23:35:13 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:855 endpoint:rtpbridge/3@mgw CRCX: creating new connection ... May 13 23:35:13 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1091 endpoint:rtpbridge/3@mgw CI:CC7B5F4C CRCX: connection successfully created May 13 23:35:13 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:855 endpoint:rtpbridge/4@mgw CRCX: creating new connection ... May 13 23:35:13 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1091 endpoint:rtpbridge/4@mgw CI:5C7786CB CRCX: connection successfully created May 13 23:35:13 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1124 endpoint:rtpbridge/4@mgw MDCX: modifying existing connection ... May 13 23:35:13 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_sdp.c:436 endpoint:rtpbridge/4@mgw CI:5C7786CB Got media info via SDP: port:17106, addr:172.16.0.11, duration:20, payload-types:3=GSM May 13 23:35:13 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1312 endpoint:rtpbridge/4@mgw CI:5C7786CB MDCX: connection successfully modified May 13 23:35:14 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:855 endpoint:rtpbridge/4@mgw CRCX: creating new connection ... May 13 23:35:14 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_sdp.c:436 endpoint:rtpbridge/4@mgw CI:D995ACEF Got media info via SDP: port:7530, addr:172.16.0.1, duration:20, payload-types:3=GSM May 13 23:35:14 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1091 endpoint:rtpbridge/4@mgw CI:D995ACEF CRCX: connection successfully created May 13 23:35:14 huautla-bsc osmo-mgw[481]: DRTP mgcp_network.c:970 (rtpbridge/3@mgw I:CC7B5F4C) RTP packet too short (1 < 12) May 13 23:35:16 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 13 23:35:18 huautla-bsc osmo-sip-connector[755]: Fri May 13 23:35:18 2022 DMNCC ERROR mncc.c:54 command(0x204) never arrived for leg(9495) May 13 23:35:18 huautla-bsc osmo-sip-connector[755]: Fri May 13 23:35:18 2022 DSIP ERROR sip.c:463 cause2status(): Cause(unknown 0x0) not found in map. May 13 23:35:19 huautla-bsc CRON[12219]: pam_unix(cron:session): session closed for user root May 13 23:35:28 huautla-bsc osmo-bsc[5253]: DMSC ERROR gsm_08_08.c:527 SUBSCR_CONN(msc0-conn26826_subscr-IMSI-334020504491834-TMSI-0x14e5ca1e)[0x5582efa936e0]{WAIT_CLEAR_CMD}: Event MO_COMPL_L3 not permitted May 13 23:35:28 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: Operation not permitted. May 13 23:35:36 huautla-bsc osmo-bsc[5253]: DMSC ERROR gsm_08_08.c:527 SUBSCR_CONN(msc0-conn26826_subscr-IMSI-334020504491834-TMSI-0x14e5ca1e)[0x5582efa936e0]{WAIT_CLEAR_CMD}: Event MO_COMPL_L3 not permitted May 13 23:35:36 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: Operation not permitted. May 13 23:35:36 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 13 23:35:36 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[other]: no resources for SDCCH 0x16, retrying with TCH_H May 13 23:35:43 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 13 23:35:43 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[other]: no resources for SDCCH 0x16, retrying with TCH_H May 13 23:35:43 huautla-bsc osmo-bsc[5253]: DMSC ERROR gsm_08_08.c:527 SUBSCR_CONN(msc0-conn26826_subscr-IMSI-334020504491834-TMSI-0x14e5ca1e)[0x5582efa936e0]{WAIT_CLEAR_CMD}: Event MO_COMPL_L3 not permitted May 13 23:35:43 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: Operation not permitted. May 13 23:35:45 huautla-bsc osmo-bsc[5253]: DRESET NOTICE bssmap_reset.c:91 bssmap_reset(msc-0)[0x5582ef9cdce0]{DISC}: link lost
May 13 23:33:01 huautla-bsc CRON[12091]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:33:01 huautla-bsc CRON[12092]: (root) CMD (/var/rhizomatica/bin/network_update60.sh) May 13 23:33:01 huautla-bsc CRON[12091]: pam_unix(cron:session): session closed for user root May 13 23:33:06 huautla-bsc osmo-bsc[5253]: DMSC ERROR gsm_08_08.c:527 SUBSCR_CONN(msc0-conn26733_subscr-IMSI-334020504491834-TMSI-0x14e5ca1e)[0x5582efa01730]{WAIT_CLEAR_CMD}: Event MO_COMPL_L3 not permitted May 13 23:33:06 huautla-bsc osmo-bsc[5253]: DLINP ERROR input/ipaccess.c:432 Bad signalling message, sign_link returned error: Operation not permitted. May 13 23:33:07 huautla-bsc osmo-bsc[5253]: DRESET NOTICE bssmap_reset.c:91 bssmap_reset(msc-0)[0x5582ef9cdce0]{DISC}: link lost May 13 23:33:07 huautla-bsc osmo-bsc[5253]: DMSC NOTICE a_reset.c:54 (msc0) BSSMAP assocation is down
May 13 23:07:01 huautla-bsc CRON[8905]: pam_unix(cron:session): session opened for user root by (uid=0) May 13 23:07:01 huautla-bsc CRON[8906]: (root) CMD (/var/rhizomatica/bin/network_update60.sh) May 13 23:07:01 huautla-bsc CRON[8905]: pam_unix(cron:session): session closed for user root May 13 23:07:02 huautla-bsc osmo-bsc[5253]: DMSC NOTICE gsm_08_08.c:79 Tx MSC SAPI N REJECT (dlci=0xc3, cause='BSS NOT EQUIPPED') May 13 23:07:07 huautla-bsc osmo-bsc[5253]: DLSCCP NOTICE sccp_scoc.c:1597 Received message for opc=185=0.23.1 on conn with mismatching remote pc=0=0.0.0 May 13 23:07:07 huautla-bsc osmo-bsc[5253]: DLSCCP NOTICE sccp_scoc.c:1597 Received message for opc=185=0.23.1 on conn with mismatching remote pc=0=0.0.0 May 13 23:07:07 huautla-bsc osmo-bsc[5253]: DLSCCP NOTICE sccp_scoc.c:1597 Received message for opc=185=0.23.1 on conn with mismatching remote pc=0=0.0.0 May 13 23:07:08 huautla-bsc osmo-bsc[5253]: DCHAN ERROR abis_rsl.c:1257 lchan(1-0-1-SDCCH8-6)[0x5582ef9d0090]{ESTABLISHED}: (type=SDCCH) CONNECTION FAIL (cause=Radio Link Failure [ 01 ]) May 13 23:07:09 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 13 23:07:17 huautla-bsc osmo-bsc[5253]: DLSCCP NOTICE sccp_scoc.c:1597 Received message for opc=185=0.23.1 on conn with mismatching remote pc=0=0.0.0 May 13 23:07:17 huautla-bsc osmo-bsc[5253]: DLSCCP NOTICE sccp_scoc.c:1597 Received message for opc=185=0.23.1 on conn with mismatching remote pc=0=0.0.0 May 13 23:07:17 huautla-bsc osmo-bsc[5253]: DLSCCP NOTICE sccp_scoc.c:1597 Received message for opc=185=0.23.1 on conn with mismatching remote pc=0=0.0.0 May 13 23:07:26 huautla-bsc osmo-sip-connector[755]: Fri May 13 23:07:26 2022 DSIP NOTICE sip.c:511 Ending leg(0x55c516abf510) in connected state. May 13 23:07:26 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/9@mgw DLCX: deleting connection(s) ... May 13 23:07:26 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/9@mgw DLCX: connection successfully deleted May 13 23:07:26 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/9@mgw DLCX: deleting connection(s) ... May 13 23:07:26 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/9@mgw DLCX: connection successfully deleted May 13 23:07:26 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/a@mgw DLCX: deleting connection(s) ... May 13 23:07:26 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/a@mgw DLCX: connection successfully deleted May 13 23:07:26 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1341 endpoint:rtpbridge/a@mgw DLCX: deleting connection(s) ... May 13 23:07:26 huautla-bsc osmo-mgw[481]: DLMGCP mgcp_protocol.c:1458 endpoint:rtpbridge/a@mgw DLCX: connection successfully deleted May 13 23:07:28 huautla-bsc osmo-bsc[5253]: DCHAN NOTICE abis_rsl.c:2947 lchan(1-0-4-TCH_H-1)[0x5582ef9a05f0]{WAIT_RF_RELEASE_ACK}: (type=TCH_H) Rx IPACC DLCX IND (cause=Normal event, unspecified [ 0f ]) May 13 23:07:39 huautla-bsc ssh[2713]: debug1: client_input_global_request: rtype keepalive@openssh.com want_reply 1 May 13 23:07:44 huautla-bsc apcupsd[801]: Communications with UPS lost. May 13 23:07:47 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 13 23:07:47 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[Location updating]: no resources for SDCCH 0x9, retrying with TCH_H May 13 23:07:56 huautla-bsc osmo-bsc[5253]: DRLL NOTICE lchan_select.c:305 (bts=1) Failed to select SDCCH channel May 13 23:07:56 huautla-bsc osmo-bsc[5253]: DRSL NOTICE abis_rsl.c:2199 (bts=1) CHAN RQD[Location updating]: no resources for SDCCH 0x2, retrying with TCH_H May 13 23:07:59 huautla-bsc osmo-bsc[5253]: DCHAN ERROR lchan_fsm.c:1801 lchan(1-0-1-SDCCH8-1)[0x5582ef9cfcd0]{WAIT_RLL_RTP_ESTABLISH}: (type=SDCCH) lchan allocation failed in state WAIT_RLL_RTP_ESTABLISH: Timeou May 13 23:08:01 huautla-bsc osmo-bsc[5253]: DRESET NOTICE bssmap_reset.c:91 bssmap_reset(msc-0)[0x5582ef9cdce0]{DISC}: link lost May 13 23:08:01 huautla-bsc osmo-bsc[5253]: DMSC NOTICE a_reset.c:54 (msc0) BSSMAP assocation is down
so there always is a cron job for /var/rhizomatica/bin/network_update60.sh
running right before the link is lost...
Updated by laforge about 2 years ago
- Related to Bug #5564: blocking database I/O by SMS database added
Updated by keith about 2 years ago
- Related to Bug #5559: OsmoMSC at 100% CPU and unresponsive for up to several minutes! added
Updated by laforge about 2 years ago
- Debian unstable instead of 10 (different kernel/libc/libsqlite3)
- all builds with
--enable-asan
and hence much slower than normal builds
I've created a TC_sms_load
TTCN3 test which simulates 16 concurrent subscribers (by 16 ConnHdlr components), each performing LU + MO-SMS. This is executed in a loop. The recipients are intentionally all offline, resulting in all SMS to be stored in the database.
When using osmo-msc 2db1966e25bbecb4dfb275f88e03aacace7863d0 (before switching sqlite3 to single-thread operation), I can send about 70 SMS before the code in the test suite interacting with the VTY times out (2s timeout). I can also see in the bpftrace output:
@dbi_query: [8, 16) 9 | | [16, 32) 40 |@@@ | [32, 64) 629 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [64, 128) 101 |@@@@@@@@ | [128, 256) 16 |@ | [256, 512) 5 | | [512, 1K) 1 | | [1K, 2K) 0 | | [2K, 4K) 1 | | @sqlite3_step: [0] 336853 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 16 | | [2, 4) 2 | | [4, 8) 5 | | [8, 16) 33 | | [16, 32) 34 | | [32, 64) 14 | | [64, 128) 3 | | @sms_submit_pending: [256, 512) 1 |@@@@@@ | [512, 1K) 1 |@@@@@@ | [1K, 2K) 3 |@@@@@@@@@@@@@@@@@@@ | [2K, 4K) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| @smsq_take_next_sms: [256, 512) 1 |@@@@@@ | [512, 1K) 1 |@@@@@@ | [1K, 2K) 3 |@@@@@@@@@@@@@@@@@@@ | [2K, 4K) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
so indeed, we are exceeding 2s several times in the sms_submit_pending / smsq_take_next_sms
When using current master (63494a6bef1a54f87742e83d9ad18f64d2dae2af) with sqlite3 in single-theraded mode, I can see similar results. After 82 SMS the timeout is hit:
@sms_submit_pending: [512, 1K) 2 |@@@@@@@@@@@@@ | [1K, 2K) 4 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | [2K, 4K) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [4K, 8K) 2 |@@@@@@@@@@@@@ | [8K, 16K) 1 |@@@@@@ | @sms_submit_pending_start: 3793923232666688 @smsq_take_next_sms: [512, 1K) 2 |@@@@@@@@@@@@@ | [1K, 2K) 4 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | [2K, 4K) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [4K, 8K) 2 |@@@@@@@@@@@@@ | [8K, 16K) 1 |@@@@@@ | @sqlite3_step: [0] 529179 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 20 | | [2, 4) 4 | | [4, 8) 3 | | [8, 16) 55 | | [16, 32) 45 | | [32, 64) 3 | | [64, 128) 1 | | [128, 256) 1 | |
When switching to my current laforge/sqlite3
branch using libsqlite3 directly and prepared statements, I can insert more than 1600 SMS successfully without running into any timeouts:
@sqlite3_step: [0] 18337 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 14812 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [2, 4) 3110 |@@@@@@@@ | [4, 8) 1488 |@@@@ | [8, 16) 997 |@@ | [16, 32) 342 | | [32, 64) 44 | | [64, 128) 4 | | [128, 256) 1 | | [256, 512) 1 | | @sms_submit_pending: [64, 128) 28 |@@@@@@@@ | [128, 256) 168 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [256, 512) 143 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [512, 1K) 18 |@@@@@ | @smsq_take_next_sms: [64, 128) 28 |@@@@@@@@ | [128, 256) 169 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [256, 512) 142 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [512, 1K) 18 |@@@@@ |
So as we can see, we occasionally exceeded 512ms, but never reached 1s.
When looking at the osmo-msc log concurrently with the bpftrace output, it is obvious that the culprit is db_sms_get_next_unsent_rr_msisdn()
which is called up to 100 times per smsq_take_next_sms()
. In the old code (libdbi), we are not preparing (pre-compiling) the query, so we not only have 100 executions, but also 100 compilations of that very same SQL statement.
Updated by laforge about 2 years ago
- % Done changed from 0 to 50
So, with 1600 MO-SMS for currently detached subscribers getting fed into osmo-msc, and then later on delivered to their respective recipients, I'm getting this plot on current laforge/sqlite3
(a37c1d3c4977e7a01b27ee109958e4bfb05ea763):
@sqlite3_step: [0] 7096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [1] 3801 |@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [2, 4) 2110 |@@@@@@@@@@@@@@@ | [4, 8) 2731 |@@@@@@@@@@@@@@@@@@@@ | [8, 16) 2238 |@@@@@@@@@@@@@@@@ | [16, 32) 837 |@@@@@@ | [32, 64) 85 | | [64, 128) 8 | | [128, 256) 5 | | [256, 512) 1 | | @sms_submit_pending: [4, 8) 5 |@@ | [8, 16) 43 |@@@@@@@@@@@@@@@@@@@@@ | [16, 32) 72 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [32, 64) 102 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [64, 128) 60 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [128, 256) 64 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [256, 512) 36 |@@@@@@@@@@@@@@@@@@ | [512, 1K) 2 |@ | @smsq_take_next_sms: [0] 34 |@@@@ | [1] 366 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [2, 4) 297 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [4, 8) 139 |@@@@@@@@@@@@@@@@@@@ | [8, 16) 91 |@@@@@@@@@@@@ | [16, 32) 30 |@@@@ | [32, 64) 5 | | [64, 128) 7 | | [128, 256) 31 |@@@@ | [256, 512) 30 |@@@@ | [512, 1K) 2 | |
So as we can see, all latencies less than 1s. This looks much better.
Updated by laforge about 2 years ago
- Related to Feature #5566: avoid using synchronous = FULL added
Updated by laforge almost 2 years ago
- Status changed from In Progress to Stalled