HQ: A pipeline tool for The Joker
=================================

This package simplifies creating pipelines to run The Joker on large datasets
of radial velocities. The primary way to use this tool is through the
command-line interface ``hq``, which is installed when you pip install this
package.

.. .. toctree::
..     :maxdepth: 2

..     hq/index.rst

Example pipeline
----------------

* **Initialize the run**: Create a folder / repository / project for your run
  of the pipeline as a place to host the input data files and all of the
  outputs generated by this tool. For example, if I were to run this on APOGEE
  DR17 visit data, I would create a new repository "apogee-dr17". I would then
  put the RV catalog (one RV measurement per row, with a column that holds a
  unique source ID) in "apogee-dr17/data".

  To initialize the HQ run, we run ``hq init`` and specify the path where we
  want to store the run configuration files. For example, to store the HQ
  configuration files in "apogee-dr17/hq-config", I would run::

      hq init --path apogee-dr17/hq-config

  This will create the path (if it does not already exist) and copy in the
  template configuration files that ``hq`` needs to run the rest of the
  pipeline. In particular, this will create a "config.yml" file, which
  contains the actual configurable values, and a "prior.py" file, which
  contains the pymc3 model specification of the prior used by The Joker and
  MCMC to generate the orbital parameter samplings.

* **Edit the config files**: You will need to edit both of these files to set
  the values that control how the run proceeds. A number of the parameters in
  the generated config.yml file (here, "apogee-dr17/hq-config/config.yml") are
  required and have no default values. In particular, you must set the
  ``input_data_file`` parameter to the full path to the radial velocity data
  file you would like to run on. You may also want to set the ``cache_path``
  parameter, which sets the location that HQ will use to store output data
  files. Here, for example, we may want to set this to
  "/full/path/to/apogee-dr17/cache". All of the required parameters are
  labeled with the comment ``# REQUIRED``.

* (optional) **Define the run environment**: All of the ``hq`` commands accept
  the run path containing the configuration files via the ``--path``
  command-line flag. In this example, this would be the path to
  "apogee-dr17/hq-config". However, you can also set this globally by setting
  the ``$HQ_RUN_PATH`` environment variable, so that you do not have to pass
  the path to every command. For the rest of these examples, I will assume
  that you have set ``$HQ_RUN_PATH`` to your run config path!

* **Create the prior samples cache file**: TODO::

      hq make_prior_cache

* **Set up the tasks used to parallelize and deploy the run**: TODO::

      hq make_tasks

* **Run The Joker sampler on all stars**: TODO::

      hq run_thejoker

* (optional) **Fit a robust constant RV model to all sources**: TODO::

      hq run_constant

* **Analyze The Joker samplings**: Determine which stars are complete and
  which stars need to be followed up with standard MCMC::

      hq analyze_joker

* TODO: HQ_THEANO_PATH=/tmp/theano_cache

* **Run standard MCMC on the unimodal samplings**: TODO::

      hq run_mcmc

* **Analyze the MCMC samplings**: TODO::

      hq analyze_mcmc

* **Combine the metadata files**: TODO::

      hq combine_metadata

.. toctree::
    :caption: Development
    :hidden:

    GitHub Repository
    The Joker
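
The ordering of the steps in the example pipeline can be summarized with a
small driver script. The following is only a sketch under stated assumptions:
the helper names ``build_commands`` and ``run_pipeline`` are hypothetical and
not part of HQ, and it assumes that each subcommand can be run without extra
arguments once ``$HQ_RUN_PATH`` is set (individual steps may well need
additional options, or a job scheduler, that are not covered here). Only the
subcommand names and the ``HQ_RUN_PATH`` variable are taken from the steps
above. By default it only prints the commands it would run.

```python
import os
import shlex
import subprocess

# Subcommand names and their order are taken from the example pipeline.
# "run_constant" is the optional constant-RV fit.
PIPELINE_STEPS = [
    "make_prior_cache",
    "make_tasks",
    "run_thejoker",
    "run_constant",
    "analyze_joker",
    "run_mcmc",
    "analyze_mcmc",
    "combine_metadata",
]


def build_commands(include_constant=True):
    """Return the hq invocations, in pipeline order, as argument lists."""
    steps = [
        step
        for step in PIPELINE_STEPS
        if include_constant or step != "run_constant"
    ]
    return [["hq", step] for step in steps]


def run_pipeline(run_path, dry_run=True, include_constant=True):
    """Print each step and, unless dry_run is set, execute it in order.

    The run path is handed to hq through the HQ_RUN_PATH environment
    variable. Execution stops at the first step that exits with an error.
    """
    env = dict(os.environ, HQ_RUN_PATH=str(run_path))
    commands = build_commands(include_constant)
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a step fails.
            subprocess.run(cmd, env=env, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical run path, matching the example used in this page.
    run_pipeline("apogee-dr17/hq-config", dry_run=True)
```

Calling ``run_pipeline(..., dry_run=False)`` would execute the steps for real,
which assumes that ``hq`` is installed and that the configuration files have
already been created with ``hq init`` and edited.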