User Preference

Editing User Preferences in aPhyloGeo

The aPhyloGeo software allows users to customize their preferences through a YAML configuration file. This file includes a set of parameters for easy handling. Below is an example configuration file with explanations for each parameter:

Configuration File Example

file_name: './datasets/example/geo.csv'
specimen: 'id'
names: ['id', 'ALLSKY_SFC_SW_DWN', 'T2M', 'PRECTOTCORR', 'QV2M', 'WS10M']
bootstrap_threshold: 0
dist_threshold: 60
window_size: 200
step_size: 100
bootstrap_amount: 100
data_names: ['ALLSKY_SFC_SW_DWN_newick', 'T2M_newick', 'QV2M_newick', 'PRECTOTCORR_newick', 'WS10M_newick']
reference_gene_dir: './datasets/example'
reference_gene_file: 'sequences.fasta'
makeDebugFiles: True
alignment_method: '1' # Options: 1: pairwiseAligner, 2: MUSCLE, 3: CLUSTALW, 4: MAFFT
distance_method: '1' # Options: 1: Least-Square distance, 2: Robinson-Foulds distance, 3: Euclidean distance (DendroPY)
fit_method: '1' # Options: 1: Wider Fit by elongating with Gap (starAlignment), 2: Narrow-fit prevent elongation with gap when possible
tree_type: '1' # Options: 1: BioPython consensus tree, 2: FastTree application
rate_similarity: 90
method_similarity: '1' # Options: 1: Hamming distance, 2: Levenshtein distance, 3: Damerau-Levenshtein distance, 4: Jaro similarity, 5: Jaro-Winkler similarity, 6: Smith–Waterman similarity, 7: Jaccard similarity, 8: Sørensen-Dice similarity

User Preferences Options:

File Name: Path to the input data file (./datasets/example/geo.csv in the example).
Specimen: Identifier for the specimens in the dataset (id in the example).
Names: List of column names in the dataset (['id', 'ALLSKY_SFC_SW_DWN', 'T2M', 'PRECTOTCORR', 'QV2M', 'WS10M'] in the example).
Bootstrap Threshold: Number of replicates threshold to be generated for each sub-MSA.
Distance Threshold: Distance threshold between genetic tree and climatic tree for each sub-MSA.
Window Size: Size of the sliding window.
Step Size: Sliding window advancement step.
Bootstrap Amount: Number of bootstraps to be generated.
Data Names: List of newick file names for each dataset.
Reference Gene Directory: Directory containing reference gene data (./datasets/example in the example).
Reference Gene File: File containing reference gene sequences (sequences.fasta in the example).
Make Debug Files: Option to generate debug files (True or False).
Alignment Method: Algorithm selection for sequence alignment (Options: 1: pairwiseAligner, 2: MUSCLE, 3: CLUSTALW, 4: MAFFT in the example).
Distance Method: Distance selection (Options: 1: Least-Square distance, 2: Robinson-Foulds distance, 3: Euclidean distance (DendroPY) in the example).
Fit Method: Gap selection elongation (Options: 1: Wider Fit by elongating with Gap (starAlignment), 2: Narrow-fit prevent elongation with gap when possible in the example).
Tree Inference Method: The choice of inference methods (Options: 1: BioPython consensus tree, 2: FastTree application in the example).
Rate Similarity: The rate similarity between sequences to reduce and remove the sub-MSA with a high value of similarity.
Method Similarity: The choice of similarity methods (Options: 1: Hamming distance, 2: Levenshtein distance, 3: Damerau-Levenshtein distance, 4: Jaro similarity, 5: Jaro-Winkler similarity, 6: Smith–Waterman similarity, 7: Jaccard similarity, 8: Sørensen-Dice similarity in the example).

Please email us at: [email protected] for any questions or feedback.

Wiki

Available analyses

Misc

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

User Preference

Editing User Preferences in aPhyloGeo

Configuration File Example

User Preferences Options:

Clone this wiki locally