osmo_stat_item_get_next doesn't properly count skipped values
src/stat_item.c:osmo_stat_item_get_next() is supposed to return the amount of skipped values (see comment describing the function). But this is broken if there is more than one osmo_stat_item involved.In detail:
- stats are tracked for a list of osmo_stat_item_group, of which each has a list of osmo_stat_item, of which each has an array of values.
- The logic to find skipped entries is based on the global current_stat_item_index in stats.c and global_value_id in stats_item.c. Both are shared for all values, items, groups.
- Whenever a new value is added to any of the osmo_stat_item's values, global_value_id gets increased by one and the id gets assigned to that value.
- current_stat_item_index stores the last ID that was reported to all attached reporters (e.g. the log reporter, or a statsd instance).
- In osmo_stat_item_get_next, the diff from current_stat_item_index to the IDs of the first value that is lower than current_stat_item_index, is assumed to be the number of skipped stat values. But if there is more than one osmo_stat_item, the ID may increase by more than one between the values.
I'm working on a fix.
stat_item: make next_id argument name consistent
Let osmo_stat_item_get_next, osmo_stat_item_discard,
osmo_stat_item_discard_all consistently refer to their next_id arg as
such (and not idx or next_idx). It refers to an ID (item->values[i].id),
not an index (item->values[i]), and it is always the next one, never the
Do the same change for index/_idx variables in stats.c, which are used
as arguments to these functions. Replace rd with next_id_ in
stat_item: add comment with struct overview
stat_item: make value ids item specific
Fix counting of values missed because of FIFO overflow in
osmo_stat_item_get_next(), by assigning a new item value id effectively
as item->value[n + 1].id = item->value[n].id + 1, instead of increasing
a global_value_id that is shared between all items and groups. With
global_value_id, the count of values missed was wrong for one item, as
soon as a new value was added to another item.
This partially reverts b27b352e ("stats: Use a global index for stat
item values") from 2015, right after stats was added to libosmocore. It
was supposed to make multiple readers (reporters) possible, which could
read independently from stat_item (and later added comments explain it
like that). But this remained unused, stats has implemented multiple
reporters by reading all stat_items once and sending the same data to
all enabled reporters. The patch caused last_value_index in struct
osmo_stat_item to always remain at -1.
Replace this unused last_value_index with stats_next_id, so stats can
store the item-specific next_id in the struct again. It appears that
stats is the only direct user of osmo_stat_item, but if there are
others, they can bring their own item-specific next_id: functions in
stat_item.c still accept a next_id argument.
stats_test: restore stat_item_get_next asserts
This is a partial revert of b27b352e ("stats: Use a global index for
stat item values"). Now that osmo_stat_item_get_next correctly returns
how many values have been skipped, we can use the accurate asserts on
its return value again.
Fix the initial values of next_id_a,b (1 instead of 0), so we don't get
a skipped value on the first read. This is needed, because b27b352e
refactored osmo_stat_item_get_next to have the next id as parameter
instead of the last read one, and the initial value was not adjusted in