refactor: Primer -> Oligo, PrimerLike -> OligoLike #51
Merged
2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,207 @@ | ||
""" | ||
# Oligo Class and Methods | ||
This module contains a class and class methods to represent an oligo (e.g., designed by Primer3). | ||
Oligos can represent single primer and/or internal probe designs. | ||
Class attributes include the base sequence, melting temperature, and the score of the oligo. The | ||
mapping of the oligo to the genome is also stored. | ||
Optional attributes include naming information and a tail sequence to attach to the 5' end of the | ||
oligo (if applicable). Optional attributes also include the thermodynamic results from Primer3. | ||
## Examples of interacting with the `Oligo` class | ||
```python | ||
>>> from prymer.api.span import Span, Strand | ||
>>> oligo_span = Span(refname="chr1", start=1, end=20) | ||
>>> oligo = Oligo(tm=70.0, penalty=-123.0, span=oligo_span, bases="AGCT" * 5) | ||
>>> oligo.longest_hp_length() | ||
1 | ||
>>> oligo.length | ||
20 | ||
>>> oligo.name is None | ||
True | ||
>>> oligo = Oligo(tm=70.0, penalty=-123.0, span=oligo_span, bases="GACGG"*4) | ||
>>> oligo.longest_hp_length() | ||
3 | ||
>>> oligo.untailed_length() | ||
20 | ||
>>> oligo.tailed_length() | ||
20 | ||
>>> primer = oligo.with_tail(tail="GATTACA") | ||
>>> primer.untailed_length() | ||
20 | ||
>>> primer.tailed_length() | ||
27 | ||
>>> primer = primer.with_name(name="fwd_primer") | ||
>>> primer.name | ||
'fwd_primer' | ||
``` | ||
Oligos may also be written to a file and subsequently read back in, as the `Oligo` class is an | ||
`fgpyo` `Metric` class: | ||
```python | ||
>>> from pathlib import Path | ||
>>> left_span = Span(refname="chr1", start=1, end=20) | ||
>>> left = Oligo(tm=70.0, penalty=-123.0, span=left_span, bases="G"*20) | ||
>>> right_span = Span(refname="chr1", start=101, end=120) | ||
>>> right = Oligo(tm=70.0, penalty=-123.0, span=right_span, bases="T"*20) | ||
>>> path = Path("/tmp/path/to/primers.txt") | ||
>>> Oligo.write(path, left, right) # doctest: +SKIP | ||
>>> primers = Oligo.read(path) # doctest: +SKIP | ||
>>> list(primers) # doctest: +SKIP | ||
[ | ||
Oligo(tm=70.0, penalty=-123.0, span=left_span, bases="G"*20), | ||
Oligo(tm=70.0, penalty=-123.0, span=right_span, bases="T"*20) | ||
] | ||
``` | ||
""" | ||
|
||
from dataclasses import dataclass | ||
from dataclasses import replace | ||
from typing import Any | ||
from typing import Callable | ||
from typing import Dict | ||
from typing import Optional | ||
|
||
from fgpyo.fasta.sequence_dictionary import SequenceDictionary | ||
from fgpyo.sequence import longest_dinucleotide_run_length | ||
from fgpyo.sequence import longest_homopolymer_length | ||
from fgpyo.util.metric import Metric | ||
|
||
from prymer.api.oligo_like import MISSING_BASES_STRING | ||
from prymer.api.oligo_like import OligoLike | ||
from prymer.api.span import Span | ||
|
||
|
||
@dataclass(frozen=True, init=True, kw_only=True, slots=True) | ||
class Oligo(OligoLike, Metric["Oligo"]): | ||
"""Stores the properties of the designed oligo. | ||
Oligos can include both single primer and internal probe designs. The penalty score of the | ||
design is emitted by Primer3 and controlled by the corresponding design parameters. | ||
The penalty for a primer is set by the combination of `PrimerAndAmpliconParameters` and | ||
`PrimerWeights`, whereas a probe penalty is set by `ProbeParameters` and `ProbeWeights`. | ||
Attributes: | ||
tm: the calculated melting temperature of the oligo | ||
penalty: the penalty or score for the oligo | ||
span: the mapping of the oligo to the genome | ||
bases: the base sequence of the oligo (excluding any tail) | ||
tail: an optional tail sequence to put on the 5' end of the oligo | ||
name: an optional name to use for the oligo | ||
""" | ||
|
||
tm: float | ||
penalty: float | ||
span: Span | ||
bases: Optional[str] = None | ||
tail: Optional[str] = None | ||
|
||
def __post_init__(self) -> None: | ||
Review comment: all of these methods derive from the original
||
super(Oligo, self).__post_init__() | ||
|
||
def longest_hp_length(self) -> int: | ||
"""Length of longest homopolymer in the oligo.""" | ||
if self.bases is None: | ||
return 0 | ||
else: | ||
return longest_homopolymer_length(self.bases) | ||
|
||
@property | ||
def length(self) -> int: | ||
"""Length of un-tailed oligo.""" | ||
return self.span.length | ||
|
||
def untailed_length(self) -> int: | ||
"""Length of un-tailed oligo.""" | ||
return self.span.length | ||
|
||
def tailed_length(self) -> int: | ||
"""Length of tailed oligo.""" | ||
return self.span.length if self.tail is None else self.span.length + len(self.tail) | ||
|
||
def longest_dinucleotide_run_length(self) -> int: | ||
"""Number of bases in the longest dinucleotide run in an oligo. | ||
A dinucleotide run is a repeat unit of length two repeated consecutively. For example, | ||
TCTC (length = 4) or ACACACACAC (length = 10). If there are no such runs, returns 2 | ||
(or 0 if there are fewer than 2 bases).""" | ||
return longest_dinucleotide_run_length(self.bases) | ||
|
||
def with_tail(self, tail: str) -> "Oligo": | ||
"""Returns a copy of the oligo with the tail sequence attached.""" | ||
return replace(self, tail=tail) | ||
|
||
def with_name(self, name: str) -> "Oligo": | ||
"""Returns a copy of oligo object with the given name.""" | ||
return replace(self, name=name) | ||
|
||
def bases_with_tail(self) -> Optional[str]: | ||
""" | ||
Returns the sequence of the oligo prepended by the tail. | ||
If `tail` is None, only return `bases`. | ||
""" | ||
if self.tail is None: | ||
return self.bases | ||
return f"{self.tail}{self.bases}" | ||
|
||
def to_bed12_row(self) -> str: | ||
"""Returns the BED detail format view: | ||
https://genome.ucsc.edu/FAQ/FAQformat.html#format1.7""" | ||
bed_coord = self.span.get_bedlike_coords() | ||
return "\t".join( | ||
map( | ||
str, | ||
[ | ||
self.span.refname, # contig | ||
bed_coord.start, # start | ||
bed_coord.end, # end | ||
self.id, # name | ||
500, # score | ||
self.span.strand.value, # strand | ||
bed_coord.start, # thick start | ||
bed_coord.end, # thick end | ||
"100,100,100", # color | ||
1, # block count | ||
f"{self.length}", # block sizes | ||
"0", # block starts (relative to `start`) | ||
], | ||
) | ||
) | ||
|
||
def __str__(self) -> str: | ||
""" | ||
Returns a string representation of this oligo | ||
""" | ||
# If the bases field is None, replace with MISSING_BASES_STRING | ||
bases: str = self.bases if self.bases is not None else MISSING_BASES_STRING | ||
return f"{bases}\t{self.tm}\t{self.penalty}\t{self.span}" | ||
|
||
@classmethod | ||
def _parsers(cls) -> Dict[type, Callable[[str], Any]]: | ||
return { | ||
Span: lambda value: Span.from_string(value), | ||
} | ||
|
||
@staticmethod | ||
def compare(this: "Oligo", that: "Oligo", seq_dict: SequenceDictionary) -> int: | ||
"""Compares this oligo to that oligo by their span, ordering references using the given | ||
sequence dictionary. | ||
Args: | ||
this: the first oligo | ||
that: the second oligo | ||
seq_dict: the sequence dictionary used to order references | ||
Returns: | ||
-1 if this oligo is less than that oligo, 0 if equal, 1 otherwise | ||
""" | ||
return Span.compare(this=this.span, that=that.span, seq_dict=seq_dict) |
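The `to_bed12_row()` method above builds its row from `get_bedlike_coords()`. The following is a minimal sketch of that conversion, not the prymer implementation: `SketchSpan` and the free function `to_bed12_row` are hypothetical stand-ins, and the sketch assumes the span is 1-based and closed (consistent with the docstring example where `start=1, end=20` gives a length of 20) and that the strand is written as `+` or `-`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSpan:
    """Hypothetical 1-based, closed-interval span; not prymer's `Span`."""

    refname: str
    start: int
    end: int
    strand: str = "+"  # assumed strand notation

    @property
    def length(self) -> int:
        # Closed interval, so both endpoints are counted
        return self.end - self.start + 1


def to_bed12_row(span: SketchSpan, name: str) -> str:
    """Formats a span as 12 tab-separated BED fields (illustrative only)."""
    bed_start = span.start - 1  # BED starts are 0-based
    bed_end = span.end  # BED ends are exclusive, so the inclusive 1-based end is unchanged
    fields = [
        span.refname,  # contig
        bed_start,  # start
        bed_end,  # end
        name,  # name
        500,  # score
        span.strand,  # strand
        bed_start,  # thick start
        bed_end,  # thick end
        "100,100,100",  # color
        1,  # block count
        span.length,  # block sizes
        0,  # block starts (relative to `start`)
    ]
    return "\t".join(map(str, fields))


row = to_bed12_row(SketchSpan(refname="chr1", start=1, end=20), name="fwd_primer")
print(row)
```

With a single block, the block size equals the oligo length and the block start is always 0, which is why the diff hard-codes those two fields.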
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,22 +1,22 @@ | ||
""" | ||
# Class and Methods for primer-like objects | ||
# Class and Methods for oligo-like objects | ||
The `PrimerLike` class is an abstract base class designed to represent primer-like objects, | ||
such as individual primers or primer pairs. This class encapsulates common attributes and | ||
The `OligoLike` class is an abstract base class designed to represent oligo-like objects, | ||
such as individual primers and probes or primer pairs. This class encapsulates common attributes and | ||
provides a foundation for more specialized implementations. | ||
In particular, the following methods/attributes need to be implemented: | ||
- [`span()`][prymer.api.primer_like.PrimerLike.span] -- the mapping of the primer-like | ||
- [`span()`][prymer.api.oligo_like.OligoLike.span] -- the mapping of the oligo-like | ||
object to the genome. | ||
- [`bases()`][prymer.api.primer_like.PrimerLike.bases] -- the bases of the primer-like | ||
- [`bases()`][prymer.api.oligo_like.OligoLike.bases] -- the bases of the oligo-like | ||
object, or `None` if not available. | ||
- [`to_bed12_row()`][prymer.api.primer_like.PrimerLike.to_bed12_row] -- the 12-field BED | ||
representation of this primer-like object. | ||
- [`to_bed12_row()`][prymer.api.oligo_like.OligoLike.to_bed12_row] -- the 12-field BED | ||
representation of this oligo-like object. | ||
See the following concrete implementations: | ||
- [`Primer`][prymer.api.primer.Primer] -- a class to store an individual primer | ||
- [`Oligo`][prymer.api.oligo.Oligo] -- a class to store an individual oligo | ||
- [`PrimerPair`][prymer.api.primer_pair.PrimerPair] -- a class to store a primer pair | ||
""" | ||
|
@@ -25,7 +25,6 @@ | |
from abc import abstractmethod | ||
from dataclasses import dataclass | ||
from typing import Optional | ||
from typing import TypeVar | ||
from typing import assert_never | ||
|
||
from fgpyo.sequence import gc_content | ||
|
@@ -38,9 +37,9 @@ | |
|
||
|
||
@dataclass(frozen=True, init=True, slots=True) | ||
class PrimerLike(ABC): | ||
class OligoLike(ABC): | ||
""" | ||
An abstract base class for primer-like objects, such as individual primers or primer pairs. | ||
An abstract base class for oligo-like objects, such as individual primers or primer pairs. | ||
Attributes: | ||
name: an optional name to use for the primer | ||
|
@@ -67,12 +66,12 @@ def __post_init__(self) -> None: | |
@property | ||
@abstractmethod | ||
def span(self) -> Span: | ||
"""Returns the mapping of the primer-like object to a genome.""" | ||
"""Returns the mapping of the oligo-like object to a genome.""" | ||
|
||
@property | ||
@abstractmethod | ||
def bases(self) -> Optional[str]: | ||
"""Returns the base sequence of the primer-like object.""" | ||
"""Returns the base sequence of the oligo-like object.""" | ||
|
||
@property | ||
def percent_gc_content(self) -> float: | ||
|
@@ -88,7 +87,7 @@ def percent_gc_content(self) -> float: | |
@property | ||
def id(self) -> str: | ||
""" | ||
Returns the identifier for the primer-like object. This shall be the `name` | ||
Returns the identifier for the oligo-like object. This shall be the `name` | ||
if one exists, otherwise a generated value based on the location of the object. | ||
""" | ||
if self.name is not None: | ||
|
@@ -98,7 +97,7 @@ def id(self) -> str: | |
|
||
@property | ||
def location_string(self) -> str: | ||
"""Returns a string representation of the location of the primer-like object.""" | ||
"""Returns a string representation of the location of the oligo-like object.""" | ||
return ( | ||
f"{self.span.refname}_{self.span.start}_" | ||
+ f"{self.span.end}_{self._strand_to_location_string()}" | ||
|
@@ -107,7 +106,7 @@ def location_string(self) -> str: | |
@abstractmethod | ||
def to_bed12_row(self) -> str: | ||
""" | ||
Formats the primer-like into 12 tab-separated fields matching the BED 12-column spec. | ||
Formats the oligo-like into 12 tab-separated fields matching the BED 12-column spec. | ||
See: https://genome.ucsc.edu/FAQ/FAQformat.html#format1 | ||
""" | ||
|
||
|
@@ -124,7 +123,3 @@ def _strand_to_location_string(self) -> str: | |
case _: # pragma: no cover | ||
# Not calculating coverage on this line as it should be impossible to reach | ||
assert_never(f"Encountered unhandled Strand value: {self.span.strand}") | ||
|
||
|
||
PrimerLikeType = TypeVar("PrimerLikeType", bound=PrimerLike) | ||
Review comment: removed per PR comment
||
"""Type variable for classes generic over `PrimerLike` types.""" |
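The `id` and `location_string` properties in `OligoLike` describe a fallback: use `name` when it is set, otherwise generate an identifier from the location. A minimal sketch of that logic follows. `SketchSpan`, `location_string`, and `oligo_id` are hypothetical names, and the `F`/`R` strand labels are an assumption, since the body of `_strand_to_location_string()` is not shown in this diff.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchSpan:
    """Hypothetical stand-in for prymer's `Span`."""

    refname: str
    start: int
    end: int
    strand: str = "+"  # assumed strand notation


def location_string(span: SketchSpan) -> str:
    """Builds `<refname>_<start>_<end>_<strand label>` as in the diff above."""
    strand_label = "F" if span.strand == "+" else "R"  # assumed labels
    return f"{span.refname}_{span.start}_{span.end}_{strand_label}"


def oligo_id(name: Optional[str], span: SketchSpan) -> str:
    """Returns the name if one exists, otherwise a location-based identifier."""
    if name is not None:
        return name
    return location_string(span)


span = SketchSpan(refname="chr1", start=1, end=20)
print(oligo_id(None, span))
print(oligo_id("fwd_primer", span))
```

Because the generated identifier depends only on the span, two unnamed oligos at the same location and strand would share an identifier in this sketch.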
Review comment: we converged on `design()` instead of `design_oligos()` or `design_primers()`