Hello,
Let me start off by saying the tool looks great! I've had a major problem dealing with incomplete/mis-annotated/truncated genes in pangenome analyses, and no other tools have been designed to deal with such issues -- I think this is a great advantage of PEPPA!
Before trying the tool out on my own dataset, I tried running the provided dataset. It runs just fine, but perhaps I am misunderstanding something about the output. The .gff file produced by PEPPA.py contains thousands of entries, but the .matrix file produced by PEPPA_parser.py contains only a little over 200 genes, and many ortholog groups noted in the .gff file are absent from the .matrix file.
Is this because the provided dataset is a reduced sample designed to run quickly? The pangenome is reported as 223 genes, with a core genome of 31 genes and an average of 88 genes per genome. In a full analysis, all genes identified in the .gff would be included in the .matrix file, would they not, provided they pass pseudogene filtering, etc.?
Thank you,
Conrad