Multiple changes to train.py #33

Open
5 tasks
zmek opened this issue Oct 25, 2024 · 1 comment

Comments

@zmek
Owner

zmek commented Oct 25, 2024

What needs to be done in the short term?

  • 1. Add AUPRC metrics to model_metadata
  • 2. Change MODEL__ED_ADMISSIONS__NAME to model_name
  • 3. Create a train module, with submodules admissions (allow for submodule for discharges later)
  • 4. Save class balance
  • 5. Create a dedicated script for training
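
As a rough illustration of items 1 and 4, the sketch below computes an AUPRC value (as step-wise average precision) and the class balance, then stores both in a metadata dictionary. The keys `auprc` and `class_balance`, the example model name, and the dictionary layout are assumptions, since the real `model_metadata` structure is not shown in this issue. The pure-Python metric is only for illustration; a library routine such as scikit-learn's `average_precision_score` would normally be used instead.

```python
from collections import Counter


def class_balance(labels):
    """Return the proportion of each class label, e.g. {0: 0.75, 1: 0.25}."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (average precision).

    Ranks observations by descending score and averages the precision
    measured at each positive. Tied scores are not grouped, so results can
    differ slightly from library implementations when ties occur.
    """
    n_positive = sum(1 for label in y_true if label == 1)
    if n_positive == 0:
        raise ValueError("average precision is undefined without positive labels")
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / n_positive


# Hypothetical metadata layout, for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
model_metadata = {
    "model_name": "admissions_example",
    "auprc": average_precision(y_true, y_score),
    "class_balance": class_balance(y_true),
}
```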

Note about other things we have discussed

Create two scripts (one for ed-predictor to use, one for public data) that call this one and pass in these parameters (or turn main() into a function that is called by these external scripts):

  • visits dataframe
  • yet-to-arrive dataframe
  • prediction_times - a list of tuples
  • start_training_set
  • start_validation_set
  • start_test_set
  • end_test_set
  • grid
  • ordinal_mappings
  • exclude_from_training_data
  • special_category_objects

These are returned to the calling scripts:

  • 7 trained models (5 admissions models as a list, one specialty, one yet-to-arrive)
  • model_metadata
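
One possible shape for that interface, if main() becomes a callable function, is sketched below. The names `TrainingInputs`, `TrainingOutputs`, `train_all_models` and `check_dates` are hypothetical, as are the type hints (for example, that `prediction_times` holds `(hour, minute)` tuples, or that there is one admissions model per prediction time). The function body is a stub that returns placeholders where the real code would fit models.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, List, Tuple


@dataclass
class TrainingInputs:
    visits: Any                              # visits dataframe
    yet_to_arrive: Any                       # yet-to-arrive dataframe
    prediction_times: List[Tuple[int, int]]  # assumed to be (hour, minute) tuples
    start_training_set: date
    start_validation_set: date
    start_test_set: date
    end_test_set: date
    grid: Any                                # assumed to be a hyperparameter grid
    ordinal_mappings: Any
    exclude_from_training_data: List[str]    # assumed to be column names
    special_category_objects: Any

    def check_dates(self) -> None:
        """Raise if the four split boundaries are not in chronological order."""
        boundaries = [
            self.start_training_set,
            self.start_validation_set,
            self.start_test_set,
            self.end_test_set,
        ]
        if boundaries != sorted(boundaries):
            raise ValueError("split dates must be in chronological order")


@dataclass
class TrainingOutputs:
    admissions_models: List[Any]   # five models, returned as a list
    specialty_model: Any
    yet_to_arrive_model: Any
    model_metadata: Dict[str, Any]


def train_all_models(inputs: TrainingInputs) -> TrainingOutputs:
    """Stub: returns placeholders where the real code would fit models."""
    inputs.check_dates()
    # Assumption: one admissions model per prediction time.
    admissions = [{"prediction_time": t} for t in inputs.prediction_times]
    return TrainingOutputs(
        admissions_models=admissions,
        specialty_model={"kind": "specialty"},
        yet_to_arrive_model={"kind": "yet-to-arrive"},
        model_metadata={"n_admissions_models": len(admissions)},
    )
```

Grouping the parameters in one object keeps the two calling scripts (ed-predictor and public data) symmetrical: each builds its own `TrainingInputs` and calls the same function.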

The calling script would:

  • load the data
  • apply training/test splits based on the dates passed [possibly; I've worked in some careful logic that makes sure no mrn is included in more than one set, so I may retain the current approach, which is to pre-label each visit and then check at the training stage that the modelling dates match the labelling in the dataset]
  • retrieve special_category_objects depending on whether it's UCLH or other
  • specify the variables to pass as exclude_from_training_data (some columns need to be passed in and are then removed, I believe, but I will check this)
  • save models and metadata
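
The check that no mrn appears in more than one set could look roughly like the sketch below. It uses plain dictionaries in place of dataframe rows so that it runs without dependencies. The field names `mrn` and `set`, the set labels, and the function name are assumptions, not the names used in train.py.

```python
from collections import defaultdict


def mrns_in_multiple_sets(visits):
    """Return {mrn: sorted set labels} for every mrn found in more than one set."""
    sets_by_mrn = defaultdict(set)
    for visit in visits:
        sets_by_mrn[visit["mrn"]].add(visit["set"])
    return {
        mrn: sorted(labels)
        for mrn, labels in sets_by_mrn.items()
        if len(labels) > 1
    }


# Hypothetical pre-labelled visits: patient "A" leaks across train and test.
visits = [
    {"mrn": "A", "set": "train"},
    {"mrn": "A", "set": "test"},
    {"mrn": "B", "set": "train"},
    {"mrn": "B", "set": "train"},
    {"mrn": "C", "set": "valid"},
]
leaks = mrns_in_multiple_sets(visits)
```

A training script could raise an error when `leaks` is non-empty, before any model is fitted.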
@zmek
Owner Author

zmek commented Jan 23, 2025

  • 1. Add AUPRC metrics to model_metadata
  • 2. Change MODEL__ED_ADMISSIONS__NAME to model_name
  • 3. Create a train module, with submodules admissions (allow for submodule for discharges later)
  • 4. Save class balance
  • 5. Create a dedicated script for training
  1. has been done
  2. done in commit afe908e
