[Feature request]: Subset Statistics By Subpopulation/Date in gempyor.statistics.Statistic #483

Open
TimothyWillard opened this issue Jan 23, 2025 · 0 comments
Labels
config Relating to configuration files or their framework. enhancement Request for improvement or addition of new feature(s). gempyor Concerns the Python core. inference Concerns the parameter inference framework. medium priority Medium priority. quick issue Short or easy fix.

Comments

@TimothyWillard
Contributor

Label

enhancement, gempyor, inference, config, quick issue

Priority Label

medium priority

Is your feature request related to a problem? Please describe.

In some cases ground truth data may be missing (for example, not reported), but we would still like to fit to the data that is available. For example, data may be missing for some date range or for a particular subpopulation.

See GH-481 for an example of NaNs in the ground truth that could've been ignored.

Is your feature request related to a new application, scenario round, pathogen? Please describe.

No response

Describe the solution you'd like

Add new options to a statistic's configuration for subsetting by date and/or subpopulation. An example would be:

death_incidence:
  name: sum_death_incidence
  sim_var: incidD
  data_var: deaths
  resample:
    aggregator: sum
    freq: W-SAT
    skipna: False
  zero_to_one: True
  likelihood:
    dist: pois
  subpop:
    - "37000"
    - "06000"
  period_start_date: 2024-01-01
  period_end_date: 2025-01-01

But this style of configuration assumes that any modification to a statistic limits the data to some period, rather than excluding a period in which data is missing. Some thought should be given to how best to specify "all dates except December 2023" or similar.
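
One possible way to cover both cases is to allow excluded periods alongside the start/end dates. The sketch below is only an illustration of that idea, not existing gempyor behavior: the `date_mask` helper and its `exclude_periods` argument are hypothetical names, and it works on plain Python dates rather than whatever data structure `Statistic` uses internally.

```python
from datetime import date, timedelta


def date_mask(dates, period_start=None, period_end=None, exclude_periods=()):
    """Return one boolean per date: True if that date should be fitted.

    `exclude_periods` is a hypothetical option: a sequence of
    (start, end) pairs, both inclusive, to drop from the fit.
    """
    mask = []
    for d in dates:
        # Keep dates inside the optional [period_start, period_end] window.
        keep = (period_start is None or d >= period_start) and (
            period_end is None or d <= period_end
        )
        # Then drop any date that falls inside an excluded period.
        if keep and any(start <= d <= end for start, end in exclude_periods):
            keep = False
        mask.append(keep)
    return mask


# Eight weekly dates (Saturdays) starting 2023-11-25.
dates = [date(2023, 11, 25) + timedelta(weeks=i) for i in range(8)]

# "All dates except December 2023".
mask = date_mask(dates, exclude_periods=[(date(2023, 12, 1), date(2023, 12, 31))])
kept = [d for d, keep in zip(dates, mask) if keep]
```

Here `kept` contains only the dates outside December 2023. The same boolean mask could be combined with a subpopulation filter.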

In practice I think we could replace the corresponding log-likelihood entries with 0s, since a zero term contributes nothing to the summed log-likelihood and therefore does not change the optimization.
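
A minimal sketch of the zero-replacement idea, assuming a Poisson likelihood as in the example config and a strictly positive simulated mean. The function names are hypothetical and this is not the gempyor implementation; it only shows that a zeroed entry no longer depends on the simulated value.

```python
import math


def poisson_logpmf(k, mu):
    """Log of the Poisson probability mass function at count k with mean mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def masked_loglik(observed, simulated):
    """Sum per-point Poisson log-likelihoods, using 0.0 where data is missing."""
    terms = []
    for obs, sim in zip(observed, simulated):
        if math.isnan(obs):
            # Missing ground truth: contribute a constant 0 instead of NaN.
            terms.append(0.0)
        else:
            terms.append(poisson_logpmf(obs, sim))
    return sum(terms), terms


observed = [3.0, float("nan"), 5.0]
total_a, terms_a = masked_loglik(observed, [2.5, 4.0, 6.0])
# Changing the simulated value at the missing point leaves the total unchanged.
total_b, _ = masked_loglik(observed, [2.5, 40.0, 6.0])
```

Because the masked term is the same constant for every parameter value, `total_a` and `total_b` are equal, so the missing point has no influence on which parameters maximize the log-likelihood.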

@TimothyWillard TimothyWillard added config Relating to configuration files or their framework. enhancement Request for improvement or addition of new feature(s). gempyor Concerns the Python core. inference Concerns the parameter inference framework. medium priority Medium priority. quick issue Short or easy fix. labels Jan 23, 2025