Merge pull request #55 from idia-astro/casa6
Pipeline V2.0 release
Kitchi authored Aug 16, 2022
2 parents 0605a94 + 2814800 commit 9985626
Showing 32 changed files with 1,468 additions and 652 deletions.
Binary file added .DS_Store
42 changes: 32 additions & 10 deletions README.md
@@ -4,7 +4,7 @@

# The IDIA MeerKAT Pipeline

The IDIA MeerKAT pipeline is a radio interferometric calibration pipeline designed to process MeerKAT data. **It is under heavy development, and so far only implements the cross-calibration steps, and quick-look images. Please report any issues you find in the [GitHub issues](https://github.com/idia-astro/pipelines/issues).**
The IDIA MeerKAT pipeline is a radio interferometric calibration pipeline designed to process MeerKAT data. **It is under regular development, and so far implements cross-calibration, self-calibration, and science imaging. Please report any issues you find in the [GitHub issues](https://github.com/idia-astro/pipelines/issues).**

## Requirements

@@ -14,37 +14,59 @@ This pipeline is designed to run on the Ilifu cluster, making use of SLURM and M

**Note: It is not necessary to copy the raw data (i.e. the MS) to your working directory. The first step of the pipeline does this for you by creating an MMS or MS, and does not attempt to manipulate the raw data (e.g. stored in `/idia/projects` - see [data format](https://idia-pipelines.github.io/docs/processMeerKAT/Example-Use-Cases/#data-format)).**

### 1. In order to use the `processMeerKAT.py` script, source the `setup.sh` file:
## 1. Set up the pipeline in your environment

**Please note : These docs are for an upcoming release of the pipeline. At the time of writing (July 2022) please use the CASA6 branch of the pipeline, which is the pre-release version. The master branch will be updated at the time of release.**

In order to use the `processMeerKAT.py` script, source the `setup.sh` file, which can be done on [ilifu](https://docs.ilifu.ac.za/#/) as follows:

source /idia/software/pipelines/master/setup.sh

which will add the correct paths to your `$PATH` and `$PYTHONPATH` in order to correctly use the pipeline. We recommend you add this to your `~/.profile`, for future use.
which will add the correct paths to your `$PATH` and `$PYTHONPATH` in order to correctly use the pipeline. You could consider adding this to your `~/.profile` or `~/.bashrc` for future use.
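
For example, a minimal way to do this for a bash-style shell (a sketch; adjust the profile file to your own setup) is:

    echo 'source /idia/software/pipelines/master/setup.sh' >> ~/.profile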

### 2. Build a config file:

#### a. For continuum/spectral line processing :

processMeerKAT.py -B -C myconfig.txt -M mydata.ms

#### b. For polarization processing :

processMeerKAT.py -B -C myconfig.txt -M mydata.ms -P

#### c. Including self-calibration :

This defines several variables that are read by the pipeline while calibrating the data, as well as requesting resources on the cluster. The config file parameters are described by in-line comments in the config file itself wherever possible.
processMeerKAT.py -B -C myconfig.txt -M mydata.ms -2

#### d. Including science imaging :

processMeerKAT.py -B -C myconfig.txt -M mydata.ms -I

This defines several variables that are read by the pipeline while calibrating the data, as well as requesting resources on the cluster. The [config file parameters](https://idia-pipelines.github.io/docs/processMeerKAT/config-files) are described by in-line comments in the config file itself wherever possible. The `[-P --dopol]` option can be used in conjunction with the `[-2 --do2GC]` and `[-I --science_image]` options to enable polarization calibration as well as [self-calibration](https://idia-pipelines.github.io/docs/processMeerKAT/self-calibration-in-processmeerkat) and [science imaging](https://idia-pipelines.github.io/docs/processMeerKAT/science-imaging-in-processmeerkat).
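
For example, a single build command combining these options (using the same config file and MS names as in the examples above) might look like:

    processMeerKAT.py -B -C myconfig.txt -M mydata.ms -P -2 -I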

### 3. To run the pipeline:

processMeerKAT.py -R -C myconfig.txt

This will create `submit_pipeline.sh`, which you can then run with `./submit_pipeline.sh` to submit all pipeline jobs to the SLURM queue.
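
As a minimal sketch of this step (assuming the config file built above):

    processMeerKAT.py -R -C myconfig.txt
    ./submit_pipeline.sh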

Other convenient scripts are also created that allow you to monitor and (if necessary) kill the jobs. `summary.sh` provides a brief overview of the status of the jobs in the pipeline, `findErrors.sh` checks the log files for commonly reported errors, and `killJobs.sh` kills all the jobs from the current run of the pipeline, ignoring any other jobs you might have running.
Other convenience scripts are also created that allow you to monitor and (if necessary) kill the jobs, as shown in the example after this list.

* `summary.sh` provides a brief overview of the status of the jobs in the pipeline
* `findErrors.sh` checks the log files for commonly reported errors (after the jobs have run)
* `killJobs.sh` kills all the jobs from the current run of the pipeline, ignoring any other (unrelated) jobs you might have running.
* `cleanup.sh` wipes all the intermediate data products created by the pipeline. This is intended to be run only after the pipeline has completed and the output has been verified to be good.
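
For instance, a typical sequence after submitting the jobs (a sketch using the scripts listed above) might be:

    ./summary.sh       # check the overall status of the pipeline jobs
    ./findErrors.sh    # after the jobs have run, scan the logs for commonly reported errors
    ./cleanup.sh       # once the output has been verified, remove the intermediate data products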

For help, run `processMeerKAT.py -h`, which provides a brief description of all the command line arguments.
For help, run `processMeerKAT.py -h`, which provides a brief description of all the [command line options](https://idia-pipelines.github.io/docs/processMeerKAT/using-the-pipeline#command-line-options).

## Using multiple spectral windows (new in V1.1)
## Using multiple spectral windows (new in v1.1)

Starting with V1.1 of the processMeerKAT pipeline, the default behaviour is to split up the MeerKAT band into 16 spectral windows (SPWs), and process each concurrently. This results in a few major usability changes as outlined below:
Starting with v1.1 of the processMeerKAT pipeline, the default behaviour is to split up the MeerKAT band into several spectral windows (SPWs), and process each concurrently. This results in a few major usability changes as outlined below:

1. **Calibration output** : Since the calibration is performed independently per SPW, all the output specific to that SPW is within it's own directory. Output such as the calibration tables, logs, plots etc. per SPW can be found within each SPW directory.
1. **Calibration output** : Since the calibration is performed independently per SPW, all the output specific to that SPW is within its own directory. Output such as the calibration tables, logs, plots etc. per SPW can be found within each SPW directory.

2. **Logs in the top level directory** : Logs in the top level directory (*i.e.,* the directory where the pipeline was launched) correspond to the scripts in the `precal_scripts` and `postcal_scripts` variables in the config file. These scripts are run from the top level before and after calibration respectively. By default these correspond to the scripts to calculate the reference antenna (if enabled), partition the data into SPWs, and concat the individual SPWs back into a single MS/MMS.
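
As a purely illustrative sketch of the resulting layout (the `*MHz` directory pattern and the contents listed here are assumptions, not guaranteed pipeline output):

    ls          # top level: config file, submit/monitoring scripts, and logs for the pre- and post-calibration steps
    ls *MHz/    # per-SPW directories containing that SPW's calibration tables, logs, plots, etc.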

More detailed information about SPW splitting is found [here](/docs/processMeerKAT/using-the-pipeline#spw-splitting).
More detailed information about SPW splitting is found [here](/docs/processMeerKAT/config-files#spw-splitting).

The documentation can be accessed on the [pipelines website](https://idia-pipelines.github.io/docs/processMeerKAT), or on the [Github wiki](https://github.com/idia-astro/pipelines/wiki).
Binary file added processMeerKAT/RACS.fits.gz
32 changes: 20 additions & 12 deletions processMeerKAT/aux_scripts/concat.py
@@ -1,4 +1,4 @@
#Copyright (C) 2020 Inter-University Institute for Data Intensive Astronomy
#Copyright (C) 2022 Inter-University Institute for Data Intensive Astronomy
#See processMeerKAT.py for license details.

import os
@@ -10,6 +13,6 @@
from config_parser import validate_args as va
import bookkeeping

from casatasks import *
logfile=casalog.logfile()
casalog.setlogfile('logs/{SLURM_JOB_NAME}-{SLURM_JOB_ID}.casa'.format(**os.environ))
from casatools import msmetadata,image
msmd = msmetadata()
ia = image()

import logging
from time import gmtime
logging.Formatter.converter = gmtime
@@ -25,10 +32,10 @@ def check_output(fname,files,pattern,out,job='concat',filetype='image'):
logger.info('Output file "{0}" already exists. Skipping {1}.'.format(out,job))
return None
elif len(files) == 0:
logger.warn("Didn't find any {0}s with '{1}'".format(filetype,pattern))
logger.warning("Didn't find any {0}s with '{1}'".format(filetype,pattern))
return None
elif len(files) == 1:
logger.warn("Only found 1 {0} with '{1}'. Will copy to this directory.".format(filetype,pattern))
logger.warning("Only found 1 {0} with '{1}'. Will copy to this directory.".format(filetype,pattern))
copytree(files[0], out)
return None
return files
@@ -71,11 +78,12 @@ def do_concat(visname, fields, dirs='*MHz'):

for field in [fields.targetfield,fields.gainfields,fields.extrafields]:
if field != '':
for target in field.split(','):
fname = msmd.namesforfields(int(target))[0]
for fname in field.split(','):
if fname.isdigit():
fname = msmd.namesforfields(int(fname))[0]

#Concat tt0 images (into continuum cube)
suffix = 'images/*{0}*image.tt0'.format(fname)
suffix = 'images/*.{0}*image.tt0'.format(fname)
files,pattern = get_infiles(dirs,suffix)
out = '{0}.{1}.contcube'.format(filebase,fname)
images = check_output(fname,files,pattern,out,job='imageconcat',filetype='image')
@@ -92,7 +100,7 @@ def do_concat(visname, fields, dirs='*MHz'):
logger.error("Output image '{0}' attempted to write but was not written.".format(out))

#Concat images (into continuum cube)
suffix = 'images/*{0}*image'.format(fname)
suffix = 'images/*.{0}*image'.format(fname)
files,pattern = get_infiles(dirs,suffix)
out = '{0}.{1}.contcube'.format(filebase,fname)
images = check_output(fname,files,pattern,out,job='imageconcat',filetype='image')
@@ -109,7 +117,7 @@ def do_concat(visname, fields, dirs='*MHz'):
logger.error("Output image '{0}' attempted to write but was not written.".format(out))

#Concat MSs
suffix = '*{0}*.ms'.format(fname)
suffix = '*.{0}*.ms'.format(fname)
files,pattern = get_infiles(dirs,suffix)
out = '{0}.{1}.ms'.format(filebase,fname)
MSs = check_output(fname,files,pattern,out,job='concat',filetype='MS')
@@ -118,14 +126,14 @@ def do_concat(visname, fields, dirs='*MHz'):
logger.info('Concatenating MSs with following command:')
logger.info('concat(vis={0}, concatvis={1})'.format(MSs,out))
concat(vis=MSs, concatvis=out)
if target == fields.targetfield.split(',')[0]:
if fname == fields.targetfield.split(',')[0]:
newvis = out

if not os.path.exists(out):
logger.error("Output MS '{0}' attempted to write but was not written.".format(out))

#Concat MMSs
suffix = '*{0}*.mms'.format(fname)
suffix = '*.{0}*.mms'.format(fname)
files,pattern = get_infiles(dirs,suffix)
out = '{0}.{1}.mms'.format(filebase,fname)
MMSs = check_output(fname,files,pattern,out,job='virtualconcat',filetype='MMS')
@@ -134,7 +142,7 @@ def do_concat(visname, fields, dirs='*MHz'):
logger.info('Concatenating MMSs with following command:')
logger.info('virtualconcat(vis={0}, concatvis={1})'.format(MMSs,out))
virtualconcat(vis=MMSs, concatvis=out)
if target == fields.targetfield.split(',')[0]:
if fname == fields.targetfield.split(',')[0]:
newvis = out

if not os.path.exists(out):
@@ -162,4 +170,4 @@ def main(args,taskvals):

if __name__ == '__main__':

bookkeeping.run_script(main)
bookkeeping.run_script(main,logfile)
8 changes: 4 additions & 4 deletions processMeerKAT/aux_scripts/show_ant_stats.py
@@ -1,7 +1,7 @@
#Copyright (C) 2020 Inter-University Institute for Data Intensive Astronomy
#Copyright (C) 2022 Inter-University Institute for Data Intensive Astronomy
#See processMeerKAT.py for license details.

#!/usr/bin/env python2.7
#!/usr/bin/env python3
import sys, os
import numpy as np
import matplotlib.pyplot as plt
@@ -14,7 +14,7 @@
if len(args) > 1:
if sys.argv[1] == '-h':
print('Usage: {0} filename bins'.format(os.path.split(sys.argv[0])[1]))
else:
elif sys.argv[1] != '--config':
fname = sys.argv[1]
bins = int(sys.argv[2])

@@ -45,4 +45,4 @@
plt.xlabel('Antenna')
plt.ylabel('Flagged Percentage')
plt.legend()
plt.savefig('{0}_plot.png'.format(os.path.splitext(fname)[0]))
plt.savefig('{0}_plot.png'.format(os.path.splitext(fname)[0]))
