Project

General

Profile

Bug #4375

RLCMACTest.cpp fails when changing logging levels

Added by pespin 26 days ago. Updated 6 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
Start date:
01/23/2020
Due date:
% Done:

100%

Spec Reference:

Description

I was debugging other issues in a new test I'm adding, and I dropped the empty .err file and increased debug level of DCSN1 to debug the issue, but then existing tests fail:

Changes to reproduce the issue:

diff --git a/tests/rlcmac/RLCMACTest.cpp b/tests/rlcmac/RLCMACTest.cpp
index 61026ed..673bbed 100644
--- a/tests/rlcmac/RLCMACTest.cpp
+++ b/tests/rlcmac/RLCMACTest.cpp
@@ -238,7 +238,7 @@ int main(int argc, char *argv[])
 {
        void *ctx = talloc_named_const(NULL, 1, "RLCMACTest");
        osmo_init_logging2(ctx, &gprs_log_info);
-       log_parse_category_mask(osmo_stderr_target, "DPCU,3:DLGLOBAL,1:");
+               log_parse_category_mask(osmo_stderr_target, "DPCU,3:DLGLOBAL,1:DCSN1,2:");

        //printSizeofRLCMAC();
        testRlcMacDownlink(ctx);
diff --git a/tests/testsuite.at b/tests/testsuite.at
index 8a319bd..fe87ff9 100644
--- a/tests/testsuite.at
+++ b/tests/testsuite.at
@@ -5,8 +5,7 @@ AT_BANNER([Regression tests])
 AT_SETUP([rlcmac])
 AT_KEYWORDS([rlcmac])
 cat $abs_srcdir/rlcmac/RLCMACTest.ok > expout
-cat $abs_srcdir/rlcmac/RLCMACTest.err > experr
-AT_CHECK([$OSMO_QEMU $abs_top_builddir/tests/rlcmac/RLCMACTest], [0], [expout], [experr])
+AT_CHECK([$OSMO_QEMU $abs_top_builddir/tests/rlcmac/RLCMACTest], [0], [expout], [ignore])
 AT_CLEANUP

 AT_SETUP([multi_slot])

Test output failing:

 =========Start ENCODE=============
 +++++++++Finish ENCODE+++++++++++
 vector1 = 40284f00010981c826f446809dcecb2b2b2b2b2b
-vector2 = 40284f00010981c826f446809dcecb2b2b2b2b2b
-vector1 == vector2 : TRUE
+vector2 = 40284f00010982194de88d13b9dab2b2b2b2b2b
+vector1 == vector2 : FALSE
 vector1 = 40243f2f000087b042b2b2b2b2b2b2b2b2b2b2b
 =========Start DECODE===========
 +++++++++Finish DECODE++++++++++

So it can be seen there's a missing 2b (or/and len is 1 byte less). Which means the change in logging makes the program behave differently, which shouldn't happen.

History

#1 Updated by pespin 26 days ago

The issue appears during decode time, in

decode_gsm_rlcmac_downlink() of 40284f0000001009810c826f4406809dcecb2b2b2b2b2b (Packet Uplink Assignment (EGPRS)).

Decoding issue may be related to #1640, but in any case it shouldn't change its output based on log level....

#2 Updated by pespin 26 days ago

I attach gdb output of the decoded structure for both current test (old.txt) as well as for test with the logging level changed (new.txt).

Unfortunately, gdb reports quite a lot of differences in the decoded structure....

How I got the output:

b RLCMACTest.cpp:161
-----
set max-value-size unlimited
set print array on
set print pretty on
set pagination off
set logging on
set logging file /tmp/orig.txt
print data

#3 Updated by pespin 26 days ago

#4 Updated by pespin 26 days ago

#5 Updated by pespin 15 days ago

The logging output (failing situation) of the packet which has its encoding/decoding situation changed by logging enabled is:

 =========Start ENCODE=============
 +++++++++Finish ENCODE (0)+++++++++++
 vector1 = 40 28 4f 00 00 00 10 09 81 0c 82 6f 44 06 80 9d ce cb 2b 2b 2b 2b 2b
-vector2 = 40 28 4f 00 00 00 10 09 81 0c 82 6f 44 06 80 9d ce cb 2b 2b 2b 2b 2b
-vector1 == vector2 : TRUE
+vector2 = 40 28 4f 00 00 00 10 09 82 19 04 de 88 0d 01 3b 9d ab 2b 2b 2b 2b 2b
+vector1 == vector2 : FALSE
 vector1 = 4024030f2f0000000087b0042b2b2b2b2b2b2b2b2b2b2b
 =========Start DECODE===========
 +++++++++Finish DECODE (0)++++++++++
<0000> /git/osmo-pcu/src/gsm_rlcmac.cpp:5310 csnStreamEncoder (type=10): 
| MESSAGE_TYPE = 10 
| PAGE_MODE = 0 
| Exist_PERSISTENCE_LEVEL = 0 
|  : ID 
| u.TLLI = 2 
| u.TLLI = 2013265920 
|  : End ID 
| u.PUA_EGPRS_Struct = 1 
|  : u.PUA_EGPRS_Struct 
| u.PUA_EGPRS_00 = 0 
|  : u.PUA_EGPRS_00 
| Exist_CONTENTION_RESOLUTION_TLLI = 0 
| Exist_COMPACT_ReducedMA = 0 
| EGPRS_CHANNEL_CODING_COMMAND = 0 
| RESEGMENT = 1 
| EGPRS_WindowSize = 6 
| AccessTechnologyType = 0 
| ARAC_RETRANSMISSION_REQUEST = 0 
| TLLI_BLOCK_CHANNEL_CODING = 0 
| Exist_BEP_PERIOD2 = 0 
|  : PacketTimingAdvance 
| Exist_TIMING_ADVANCE_VALUE = 1 
| TIMING_ADVANCE_VALUE = 3 
| Exist_IndexAndtimeSlot = 0 
|  : End PacketTimingAdvance 
| Exist_Packet_Extended_Timing_Advance = 0 
| Exist_Frequency_Parameters = 1 
|  : Frequency_Parameters 
| TSC = 0 | u.ARFCN = 0 
| u.ARFCN = 623 
|  : End Frequency_Parameters 
| u.Dynamic_Allocation = 1 
|  : u.Dynamic_Allocation 
| Extended_Dynamic_Allocation = 0 
| Exist_P0 = 0 
| USF_GRANULARITY = 0 
| Exist_UPLINK_TFI_ASSIGNMENT = 1 
| UPLINK_TFI_ASSIGNMENT = 0 
| Exist_RLC_DATA_BLOCKS_GRANTED = 0 
| Exist_TBF_Starting_Time = 0 
| u.Timeslot_Allocation_Power_Ctrl_Param = 1 
|  : u.Timeslot_Allocation_Power_Ctrl_Param 
| ALPHA = 10 
| Slot[0].Exist = 0 
| Slot[1].Exist = 0 
| Slot[2].Exist = 0 
| Slot[3].Exist = 0 
| Slot[4].Exist = 0 
| Slot[5].Exist = 0 
| Slot[6].Exist = 1 
| Slot[6].USF_TN = 1 
| Slot[6].GAMMA_TN = 27 
| Slot[7].Exist = 1 
| Slot[7].USF_TN = 1 
| Slot[7].GAMMA_TN = 27 
|  : End u.Timeslot_Allocation_Power_Ctrl_Param 
|  : End u.Dynamic_Allocation 
|  : End u.PUA_EGPRS_00 
|  : End u.PUA_EGPRS_Struct 
| Padding = 43|43|43|43|43|43|

#6 Updated by pespin 15 days ago

Interestingly the issue doesn't happen if I enable the category with another level (only error).

#7 Updated by pespin 15 days ago

I used gcc's -finstrument-functions feature to print the functions being entered and the sequence of functions seems to be the same both for decoding and encoding independently of whether logging is enabled or not (what's making the result change).

#8 Updated by fixeria 14 days ago

  • % Done changed from 0 to 40

I did I quick investigation, and these logging statements seem to cause the different results:

@@ -1153,13 +1153,13 @@ csnStreamDecoder(csnStream_t* ar, const CSN_DESCR* pDescr, bitvec *vector, unsig
             guint bits_to_handle = remaining_bits_len%8;
             if (bits_to_handle > 0)
             {
-             LOGPC(DCSN1, LOGL_NOTICE, "%" PRIu64 "|", bitvec_read_field(vector, &readIndex, bits_to_handle));
+             LOGPC(DCSN1, LOGL_NOTICE, "%" PRIu64 "|", bitvec_read_field(vector, &readIndex, bits_to_handle)); // FIXME
               remaining_bits_len -= bits_to_handle;
               bit_offset += bits_to_handle;
             }
             else if (bits_to_handle == 0)
             {
-             LOGPC(DCSN1, LOGL_NOTICE, "%" PRIu64 "|", bitvec_read_field(vector, &readIndex, 8));
+             LOGPC(DCSN1, LOGL_NOTICE, "%" PRIu64 "|", bitvec_read_field(vector, &readIndex, 8)); // FIXME
               remaining_bits_len -= 8;
               bit_offset += 8;
             }
@@ -1247,7 +1247,7 @@ csnStreamDecoder(csnStream_t* ar, const CSN_DESCR* pDescr, bitvec *vector, unsig
           bit_offset += no_of_bits;
         }

-       LOGPC(DCSN1, LOGL_NOTICE, "%s = %u | ", pDescr->sz , (unsigned)bitvec_read_field(vector, &readIndex, 1));
+       LOGPC(DCSN1, LOGL_NOTICE, "%s = %u | ", pDescr->sz , (unsigned)bitvec_read_field(vector, &readIndex, 1)); // FIXME
         /* existNextElement() returned FALSE, 1 bit consumed */
         bit_offset++;
         remaining_bits_len--;
@@ -1300,7 +1300,7 @@ csnStreamDecoder(csnStream_t* ar, const CSN_DESCR* pDescr, bitvec *vector, unsig
           }
         }

-       LOGPC(DCSN1, LOGL_NOTICE, "%s = %u | ", pDescr->sz , (unsigned)bitvec_read_field(vector, &readIndex, 1));
+       LOGPC(DCSN1, LOGL_NOTICE, "%s = %u | ", pDescr->sz , (unsigned)bitvec_read_field(vector, &readIndex, 1)); // FIXME

         /* existNextElement() returned FALSE, 1 bit consumed */
         bit_offset++;

By definition bitvec_read_field() moves the internal pointer (current bit position) of a given vector, and increments readIndex by a given amount of bits. The problem is that LOGPC would not evaluate its format string if the logging message is not going to be printed, e.g. if a given logging level is lower than the current one, or in case if logging is not enabled at all.

The first two conditional calls to bitvec_read_field() are related to CSN_PADDING_BITS, so that's not critical because padding is always in the end of messages. The later two are related to CSN_RECURSIVE_ARRAY and CSN_RECURSIVE_TARRAY respectively.

#9 Updated by pespin 14 days ago

Hi fixeria , thanks for spending some time into this too!

Something you forgot to mention (or perhaps you didn't check it at all) is whether the "expected" or "correct" result happens when calling bitvec_read_field() (hence advancing the readIndex counter), or if the correct behavior is to not advance the index there.

At first glance, it looks like the expected behavior is to always increment the readIndex through bitvec_read_field(), since for instance I see same amount modified on other counters, but we really need to check if that's already being done previously or afterwards somewhere else:

               LOGPC(DCSN1, LOGL_NOTICE, "%" PRIu64 "|", bitvec_read_field(vector, &readIndex, 8));
               remaining_bits_len -= 8;
               bit_offset += 8;

#10 Updated by fixeria 14 days ago

  • Status changed from New to Feedback
  • % Done changed from 40 to 100

[...] or if the correct behavior is to not advance the index there.

That's the correct behavior. Please see https://gerrit.osmocom.org/c/osmo-pcu/+/17091.

#11 Updated by fixeria 6 days ago

  • Status changed from Feedback to Resolved
  • Assignee changed from pespin to fixeria

Merged, "works for me" (C). Closing.

Also available in: Atom PDF

Add picture from clipboard (Maximum size: 48.8 MB)