Develop and measure network uptime statistic #5666

Open
kantai opened this issue Jan 7, 2025 · 0 comments

kantai commented Jan 7, 2025

While there are many ways we currently monitor different aspects of network uptime, none of them is a simple statistic that we can use to track improvements (or regressions) in the observed uptime or responsiveness of the network.

We should measure the following:

The amount of time between when a Stacks transfer hits the mempool and it gets included in a block.

Over a period of time (e.g., a month), these measurements would define a distribution, and we could track progress with statements like "Over a month, the 99th percentile transfer is included within X seconds." Changing from "Stacks transfers" to "contract calls" would also change what the statistic measures. Periods when Stacks transfers do not get included indicate that the network is stalled (i.e., total downtime in the network). Periods when contract calls do not get included indicate that the network is frequently reaching a throughput bottleneck. Transfer failures could also indicate a bottleneck, but periods when the block budget is too tight even for transfers are exceedingly rare.
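As a rough illustration of the statistic described above, here is a minimal sketch that turns per-transaction observations into a percentile. The `TxObservation` record and its field names are hypothetical, not an existing node or API schema, and the nearest-rank percentile is just one reasonable choice of definition.

```python
import math
from dataclasses import dataclass


@dataclass
class TxObservation:
    # Hypothetical record; field names are assumptions for illustration.
    txid: str
    mempool_seen_at: float  # unix seconds when the collecting node first saw the tx
    included_at: float      # unix seconds of the block that included the tx


def inclusion_latencies(observations):
    """Seconds between mempool arrival and block inclusion, per transaction."""
    return [o.included_at - o.mempool_seen_at for o in observations]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no observations")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

Feeding it a month of transfer observations and calling `percentile(latencies, 99)` would give the "X seconds" in the statement above. Running the same code over contract calls instead of transfers would give the throughput-oriented variant.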

This measurement would have some caveats that we could probably already predict:

  • The presence of a very competitive fee market could either make this statistic misleading or require additional filtering.
  • The mempool is not uniformly distributed, so the measurement is sensitive to the collecting node.
  • Downtime that makes broadcasting a transaction impossible is not counted.
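If the fee-market caveat does turn out to matter, one possible filter is to drop transactions whose fee rate falls below some threshold before computing the distribution. This is only a sketch of that idea: the `FeeObservation` record, its fields, and the threshold are all hypothetical, and choosing a sensible threshold would itself need data.

```python
from dataclasses import dataclass


@dataclass
class FeeObservation:
    # Hypothetical record; field names and units are assumptions for illustration.
    txid: str
    fee: int           # fee paid, in the chain's smallest unit
    size_bytes: int    # serialized transaction size
    latency_s: float   # seconds from mempool arrival to inclusion


def filter_by_fee_rate(observations, min_fee_rate):
    """Keep only observations paying at least min_fee_rate (fee units per byte)."""
    return [
        o for o in observations
        if o.size_bytes > 0 and o.fee / o.size_bytes >= min_fee_rate
    ]
```

The idea is that a slow inclusion for an underpriced transaction says more about the fee market than about network uptime, so excluding those samples keeps the statistic focused on stalls.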

I think that, for now at least, we should start by collecting just the simplest versions of this data and see what it shows us. If it's useful, then great, but if some of those caveats (or others) are already issues, we can apply some heuristics or transformations to deal with them.

I don't have a clear plan for collecting this data, yet. I am pretty sure that the Stacks API database could provide this data relatively easily. It's also possible the tx tracking feature of the stacks node could provide it as well.
