Develop and measure network uptime statistic #5666

kantai · 2025-01-07T16:59:35Z

While there are many ways we currently monitor different aspects of network uptime, none of them is a simple statistic that we can use to track improvements (or regressions) in the observed uptime or responsiveness of the network.

We should measure the following:

The amount of time between when a Stacks transfer hits the mempool and it gets included in a block.

Over a period of time (e.g., a month), these measurements would define a distribution, and we could track statements like "Over a month, the 99th percentile transfer is included within X seconds." Changing the measured transaction from Stacks transfers to contract calls would also change what the statistic measures. Periods when Stacks transfers do not get included indicate that the network is stalled (i.e., total downtime in the network). Periods when contract calls do not get included indicate that the network is frequently reaching a throughput bottleneck. Transfer failures could also indicate a bottleneck, but periods when the block budget is too tight even for transfers are exceedingly rare.
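As a rough illustration of the statistic described above, here is a minimal Python sketch that computes mempool-to-block latencies per transaction type and takes a nearest-rank percentile. The `TxObservation` record, its field names, and the `"token_transfer"` / `"contract_call"` labels are assumptions made for illustration; they are not taken from the Stacks API schema or the stacks node.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class TxObservation:
    # Hypothetical record; field names are assumptions, not a real schema.
    txid: str
    tx_type: str                  # e.g. "token_transfer" or "contract_call"
    mempool_seen_at: float        # unix seconds when the collecting node first saw the tx
    included_at: Optional[float]  # unix seconds of the including block, None if pending


def inclusion_latencies(observations, tx_type):
    """Return sorted mempool-to-block latencies (seconds) for one transaction type."""
    return sorted(
        o.included_at - o.mempool_seen_at
        for o in observations
        if o.tx_type == tx_type and o.included_at is not None
    )


def percentile(sorted_values, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]
```

Calling `percentile(inclusion_latencies(observations, "token_transfer"), 99)` over a month of observations would give the X in "the 99th percentile transfer is included within X seconds." Note that this sketch simply drops transactions that were never included, so a real collector would need to decide how to count transactions that are still pending during a stall.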

This measurement would have some caveats that we could probably already predict:

The presence of a very competitive fee market could either make this statistic misleading or require additional filtering.
The mempool is not uniformly distributed, so the measurement is sensitive to the collecting node.
Downtime that makes broadcasting a transaction impossible is not counted.

I think that, for now at least, we should start by collecting just the simplest versions of this data and see what it shows us. If it's useful, then great, but if some of those caveats (or others) are already issues, we can apply some heuristics or transformations to deal with them.

I don't have a clear plan for collecting this data yet. I am pretty sure that the Stacks API database could provide it relatively easily. It's also possible that the tx tracking feature of the stacks node could provide it as well.

github-project-automation bot added this to Stacks Core Eng Jan 7, 2025

github-project-automation bot moved this to Status: 🆕 New in Stacks Core Eng Jan 7, 2025
