v1.0.0
PySR v1.0.0 Release Notes
PySR 1.0.0 introduces new features for imposing specific functional forms and finding parametric expressions. It also includes TensorBoard support, along with significant updates to the core algorithm, including some important bug fixes. The default hyperparameters have also been updated based on extensive tuning, with a maxsize of 30 rather than 20.
Major New Features
Expression Specifications
PySR 1.0.0 introduces new ways to specify the structure of equations through "Expression Specifications", which expose the new backend feature AbstractExpression:
Template Expressions
TemplateExpressionSpec allows you to define a specific structure for your equations. For example:
from pysr import TemplateExpressionSpec
expression_spec = TemplateExpressionSpec(["f", "g"], "((; f, g), (x1, x2, x3)) -> sin(f(x1, x2)) + g(x3)")
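To make the structure concrete, here is a plain-Python sketch of what the template above describes. It does not call PySR. The names combine, f_candidate, and g_candidate are hypothetical stand-ins: in a real search, f and g would be sub-expressions discovered by the algorithm.

```python
import math


def combine(f, g, x1, x2, x3):
    """Mirror the template: sin(f(x1, x2)) + g(x3)."""
    return math.sin(f(x1, x2)) + g(x3)


# Hypothetical sub-expressions, standing in for what a search might find.
def f_candidate(x1, x2):
    return x1 * x2


def g_candidate(x3):
    return x3 ** 2


# f only ever sees (x1, x2) and g only sees x3, which is the constraint
# the template imposes on the search.
value = combine(f_candidate, g_candidate, 0.5, 2.0, 3.0)
```

The point of the template is the restriction on inputs: whatever form f and g take, x3 can never appear inside the sine term.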
Parametric Expressions
ParametricExpressionSpec enables fitting expressions that can adapt to different categories of data with per-category parameters:
from pysr import PySRRegressor, ParametricExpressionSpec

expression_spec = ParametricExpressionSpec(max_parameters=2)
model = PySRRegressor(
    expression_spec=expression_spec,
    binary_operators=["+", "*", "-", "/"],
)
model.fit(X, y, category=category)  # Pass category labels
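As a rough illustration of "per-category parameters", the sketch below evaluates one shared functional form with two parameters whose values depend on the category label. It is plain Python, not PySR output: the form p1 * x + p2, the category labels, and the parameter values are all made up for the example.

```python
# One shared functional form, y = p1 * x + p2, where (p1, p2) are looked up
# per category. Two parameters corresponds to max_parameters=2 above.
# All values here are hypothetical.
params = {
    "a": (2.0, 0.0),
    "b": (-1.0, 5.0),
}


def predict(x, category):
    """Evaluate the shared expression with the parameters of one category."""
    p1, p2 = params[category]
    return p1 * x + p2


samples = [(1.0, "a"), (1.0, "b"), (3.0, "b")]
predictions = [predict(x, c) for x, c in samples]
```

The expression structure is identical for every row; only the looked-up parameter values differ between categories.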
Improved Logging with TensorBoard
The new TensorBoardLoggerSpec enables logging of the search process, as well as hyperparameter recording. It exposes the AbstractSRLogger feature of the backend:
from pysr import PySRRegressor, TensorBoardLoggerSpec

logger_spec = TensorBoardLoggerSpec(
    log_dir="logs/run",
    log_interval=10,  # Log every 10 iterations
)
model = PySRRegressor(logger_spec=logger_spec)
Features logged include:
- Loss curves over time at each complexity level
- Population statistics
- Pareto "volume" logging (measures performance over all complexities with a single scalar)
- The min loss over time
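To give a feel for how a whole Pareto front can be summarized by one number, here is a minimal sketch in plain Python. It extracts the front from (complexity, loss) pairs and computes an illustrative area in log-log space. The area formula and the max_loss reference are assumptions chosen for simplicity; they are not the backend's definition of Pareto volume.

```python
import math


def pareto_front(points):
    """Keep (complexity, loss) pairs whose loss beats every simpler expression."""
    front = []
    best = math.inf
    for complexity, loss in sorted(points):
        if loss < best:
            front.append((complexity, loss))
            best = loss
    return front


def front_area(front, max_loss):
    """Illustrative scalar (NOT the backend's formula).

    Trapezoid area between log(max_loss) and the front's log-loss, integrated
    over log-complexity. Assumes all losses and complexities are positive.
    """
    area = 0.0
    for (c0, l0), (c1, l1) in zip(front, front[1:]):
        h0 = math.log(max_loss) - math.log(l0)
        h1 = math.log(max_loss) - math.log(l1)
        area += 0.5 * (h0 + h1) * (math.log(c1) - math.log(c0))
    return area


# Hypothetical search state: (complexity, loss) of the best expressions seen.
points = [(1, 10.0), (3, 4.0), (5, 6.0), (7, 1.0)]
front = pareto_front(points)
min_loss = min(loss for _, loss in front)
score = front_area(front, max_loss=10.0)
```

A scalar of this kind grows when losses drop at any complexity, so it rewards progress across the whole front rather than only at the most accurate expression.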
Algorithm Improvements
Updated Default Parameters
The default hyperparameters have been significantly revised based on testing:
- Increased default maxsize from 20 to 30, as I noticed that many people use the defaults, and this maxsize would allow for more accurate expressions.
- New mutation operator weights optimized for better performance, along with the new mutation "rotate tree".
- Improved search parameters tuned using Pareto front volume calculations.
- Default niterations increased from 40 to 100, also to support better accuracy (at the expense of slightly longer default search times).
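If you relied on the previous search budget, one option is to pass the old values explicitly. The sketch below only collects the two numeric defaults quoted above; it does not restore the old mutation weights or other tuned parameters, so results may still differ from pre-1.0 runs. The dictionary names are illustrative.

```python
# Old and new defaults as quoted in the list above.
legacy_overrides = {"maxsize": 20, "niterations": 40}
new_defaults = {"maxsize": 30, "niterations": 100}

# Summarize what changed as (old, new) pairs.
changed = {
    name: (legacy_overrides[name], new_defaults[name])
    for name in legacy_overrides
}

# Usage (requires PySR to be installed):
# from pysr import PySRRegressor
# model = PySRRegressor(**legacy_overrides)
```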
Core Changes
- New output organization: Results are now stored in outputs/<run_id>/ rather than in the directory of execution.
- Improved performance with better parallelism handling
- Support for Python 3.10+
- Updated Julia backend to version 1.10+
- Fix for aliasing issues in crossover operations
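Since each run now gets its own directory, scripts that used to read a single results file may need to locate the run directory first. The helper below is a hypothetical sketch using only the standard library: it picks the most recently modified subdirectory of outputs/. It makes no assumption about the file names PySR writes inside a run directory.

```python
from pathlib import Path


def latest_run_dir(output_directory="outputs"):
    """Return the most recently modified run directory, or None if none exist."""
    root = Path(output_directory)
    if not root.is_dir():
        return None
    runs = [p for p in root.iterdir() if p.is_dir()]
    # default=None covers an output directory that exists but has no runs yet.
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)
```

Passing an explicit run_id when constructing the model avoids the need for this kind of lookup, because the output path is then known in advance.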
Breaking Changes
- Minimum Python version is now 3.10, and minimum Julia version is 1.10
- Output file structure has changed to use directories
- Parameter name updates: equation_file → output_directory + run_id
- Added clearer naming for parallelism options, such as parallelism="serial" rather than the old multithreading=False, procs=0, which was unclear
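When migrating existing scripts, the old flag pair can be translated to the new option with a small helper like the one below. This is a sketch, not part of PySR: to_parallelism is a hypothetical name, only the "serial" case is stated in these notes, and the "multithreading" and "multiprocessing" strings are assumptions to check against the API reference. It expects explicit values rather than the old None defaults.

```python
def to_parallelism(multithreading, procs):
    """Map the old (multithreading, procs) pair to a parallelism string.

    Only the serial case is confirmed by the release notes; the other two
    return values are assumed option names.
    """
    if multithreading:
        return "multithreading"
    if procs > 0:
        return "multiprocessing"
    return "serial"


# The combination called out in the notes above.
old_style = {"multithreading": False, "procs": 0}
new_style = {"parallelism": to_parallelism(**old_style)}
```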
Documentation
The documentation has a new home at https://ai.damtp.cam.ac.uk/pysr/