{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generic least-squares function\n",
    "\n",
    "This tutorial shows how to write a basic generic least-squares cost function that works well with iminuit.\n",
    "\n",
    "We have seen in the basic tutorial how to make a least-squares function with an explicit signature that iminuit could read to find the parameter names automatically. Part of the structure of a least-squares function is always the same. What changes is the model that predicts the y-values and its parameters. Here, we show how to make a generic [weighted least-squares](https://en.wikipedia.org/wiki/Weighted_least_squares) that extracts the parameters from the model which the user provides.\n",
    "\n",
    "Note: **Cost functions for common use-cases can be imported from `iminuit.cost`**, including a better version of a generic least-squares function that we build here. The built-in cost functions come with extra features and use some insights to make them work better than naive implementations, so prefer the built-in cost functions if you can."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Derive parameters from model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from iminuit import Minuit\n",
    "from iminuit.util import describe"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class LeastSquares:\n",
    "    \"\"\"\n",
    "    Generic least-squares cost function with error.\n",
    "    \"\"\"\n",
    "\n",
    "    errordef = Minuit.LEAST_SQUARES # for Minuit to compute errors correctly\n",
    "    \n",
    "    def __init__(self, model, x, y, err):\n",
    "        self.model = model  # model predicts y for given x\n",
    "        self.x = np.asarray(x)\n",
    "        self.y = np.asarray(y)\n",
    "        self.err = np.asarray(err)\n",
    "\n",
    "    def __call__(self, *par):  # we accept a variable number of model parameters\n",
    "        ym = self.model(self.x, *par)\n",
    "        return np.sum((self.y - ym) ** 2 / self.err ** 2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's try it out with iminuit."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "def line(x, a, b):  # simple straight line model with explicit parameters\n",
    "    return a + b * x\n",
    "\n",
    "rng = np.random.default_rng(1)\n",
    "x_data = np.arange(1, 6, dtype=float)\n",
    "y_err = np.ones_like(x_data)\n",
    "y_data = line(x_data, 1, 2) + rng.normal(0, y_err)\n",
    "\n",
    "lsq = LeastSquares(line, x_data, y_data, y_err)\n",
    "\n",
    "# this fails\n",
    "try:\n",
    "    m = Minuit(lsq, a=0, b=0)\n",
    "    m.errordef=Minuit.LEAST_SQUARES\n",
    "except:\n",
    "    import traceback\n",
    "    traceback.print_exc()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What happened? iminuit uses introspection to detect the parameter names and the number of parameters. It uses the  `describe` utility for that, but it fails, since the generic method signature `LeastSquares.__call__(self, *par)`, does not reveal the number and names of the parameters.\n",
    "\n",
    "The information could be extracted from the model signature, but iminuit knows nothing about the signature of `line(x, a, b)` here. We can fix this by generating a function signature for the `LeastSquares` class from the signature of the model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# get the args from line and strip 'x'\n",
    "describe(line)[1:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from iminuit.util import make_func_code\n",
    "\n",
    "# now inject that into the lsq object with the make_func_code tool\n",
    "lsq.func_code = make_func_code(describe(line)[1:])\n",
    "\n",
    "# now we get the right answer\n",
    "describe(lsq)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can put this code into the init function of our generic least-squares class to obtain a generic least-squares class which works with iminuit."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class BetterLeastSquares(LeastSquares):\n",
    "    def __init__(self, model, x, y, err):\n",
    "        super().__init__(model, x, y, err)\n",
    "        self.func_code = make_func_code(describe(model)[1:])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "lsq = BetterLeastSquares(line, x_data, y_data, y_err)\n",
    "\n",
    "m = Minuit(lsq, a=0, b=0)\n",
    "m.migrad()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Report number of data points\n",
    "\n",
    "The minimum value of our least-squares function is asymptotically chi2-distributed. This is useful, because it allows us to judge the quality of the fit. iminuit automatically reports the reduced chi2 value $\\chi^2/n_\\text{dof}$ if the cost function reports the number of data points."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class EvenBetterLeastSquares(BetterLeastSquares):\n",
    "    @property\n",
    "    def ndata(self):\n",
    "        return len(self.x)\n",
    "\n",
    "lsq = EvenBetterLeastSquares(line, x_data, y_data, y_err)\n",
    "\n",
    "m = Minuit(lsq, a=0, b=0)\n",
    "m.migrad()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a developer of a cost function, you have to make sure that its minimum value is chi2-distributed if you add the property `ndata` to your cost function, `Minuit` has no way of knowing that.\n",
    "\n",
    "For binned likelihoods, one can make the minimum value asymptotically chi2-distributed by applying transformations, see [Baker and Cousins, Nucl.Instrum.Meth. 221 (1984) 437-442](https://doi.org/10.1016/0167-5087(84)90016-4).\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.8.13 ('venv': venv)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  },
  "vscode": {
   "interpreter": {
    "hash": "bdbf20ff2e92a3ae3002db8b02bd1dd1b287e934c884beb29a73dced9dbd0fa3"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
