-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: Primer -> Oligo, PrimerLike -> OligoLike #51
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #51 +/- ##
==========================================
- Coverage 97.25% 96.76% -0.49%
==========================================
Files 25 26 +1
Lines 1603 1701 +98
Branches 303 328 +25
==========================================
+ Hits 1559 1646 +87
- Misses 23 29 +6
- Partials 21 26 +5 ☔ View full report in Codecov by Sentry. |
31c6c3f
to
e97ca61
Compare
014e7bc
to
d32a7ac
Compare
6b57128
to
e7bf242
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we're doing this we should also think about the future where we want to enable running primer3 in the mode where it will produce left and right primers and a probe all in one go.
In the examples, can you please use more realistic values. My 2c is that documentation that uses unrealistic values reduces my confidence in the documentation. Examples are to use more realistic Tms, penalty values that are >=0, sequences that are not G*20 etc.
I don't love that this PR also introduces a ton of new parameters that we weren't previously exposing for the homodimer, end stability and hairpin thresholds ... I would ideally prefer to see those in a separate PR where we can discuss them a bit more.
docs/overview.md
Outdated
@@ -3,8 +3,8 @@ | |||
The `prymer` Python library is intended to be used for three main purposes: | |||
|
|||
1. [Clustering targets](#clustering-targets) into larger amplicons prior to designing primers. | |||
2. [Designing primers](#designing-primers) (left or right) or primer pairs using Primer3 for each target from (1). | |||
3. [Build and Picking a set of primer pairs](#build-and-picking-primer-pairs) from the designed primer pairs produced in (2). | |||
2. [Designing oligos](#designing-primers) using Primer3 for each target from (1). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's helpful to have the documentation still refer to primers, primer pairs and probes rather than "oligos".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emmcauley Let's revert this line and address more substantial documentation updates in subsequent PRs.
prymer/api/oligo.py
Outdated
""" | ||
# Oligo Class and Methods | ||
|
||
This module contains a class and class methods to represent an oligo designed by Primer3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To @msto's point elsewhere, I don't think they always have to be designed by primer3. It would be helpful to think about whether we might make more values optional (or have a default value) and also to document that they don't have to come from primer3.
I could see making penalty optional, or defaulting it to 1.0 for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional penalty would be nice for me, but we could do that in a subsequent PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a noble but perhaps lofty goal to expand the scope of this package beyond primer3
, certainly beyond the scope of this PR.
I agree we should be mindful of making decisions that are overly specific to primer3
. However I'd like to be cognizant that the content of this file is largely the same as the original primer.py
, and would like to limit this PR's scope to the planned refactoring (Primer
-> Oligo
)
prymer/primer3/primer3.py
Outdated
primer_designs: Primer3Result = Primer3Result( | ||
filtered_designs=valid_primer_designs, failures=failures | ||
design_candidates: Primer3Result = Primer3Result( | ||
filtered_designs=valid_oligo_designs, failures=failures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This came up elsewhere in conversation that perhaps filtered_designs
should just be designs
? If primer3
implemented a dinuc check, would we feel the need to call this filtered_designs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the docstring and changed the variable from filtered_designs
to designs
. These are filtered for out-of-spec characteristics and the corresponding failures
list contains the reasons why (so I have been thinking of these designs as "filtered" because they have undergone one round of filtering for primer-specific properties like GC content, TM, etc.).
primer_weights: Optional[PrimerAndAmpliconWeights] = None | ||
probe_weights: Optional[ProbeWeights] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason not to allow default weights? I.e. does primer3 blow up if you set probe weights and then design left primers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes -- if you allow default weights for both tasks, but do not supply both sets of params, Primer3 complains, because those weights are part of the objective scoring function, and it doesn't know how far off optimal it is when optimal is left undefined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find your answer confusing. I'm asking "if you provide too many weights, isn't that ok?", but the last line of your answer - "it doesn't know how far off optimal it is when optimal is left undefined." - makes it sound like you're answering "what if we didn't provide enough weights"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To use your example: Primer3 blows up if you set probe weights and then design left primers. By giving ProbeWeights()
, you are telling Primer3 how you want the resultant designs to be scored. But Primer3 complains because you didn't give it ProbeParameters()
to score against -- that is what I meant by "optimal is undefined".
Primer3 does not know what to do with the weights you provide it in the absence of corresponding parameters. Because each set of parameters is now defaulted to None
, that is the context we are working within.
Specifically, it is the PRIMER_INTERNAL_WT_GC_PERCENT_*
weights that cause the problem, here is the error when you specify PRIMER_INTERNAL_WT_GC_PERCENT_LT
and PRIMER_INTERNAL_WT_GC_PERCENT_GT
but ask for left primers only:
PRIMER_ERROR=Hyb probe GC content is part of objective function while optimum gc_content is not defined
=
primer3_core: Hyb probe GC content is part of objective function while optimum gc_content is not defined
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Circling back to this with one more thought. Would it be hard to still have these defaulted on the input, but then depending on the task, only translate the appropriate weights to tags and send them to primer3?
prymer/primer3/primer3_parameters.py
Outdated
# Set melting temperature thresholds to be 10 degrees less than the minimum primer tm | ||
default_thermo_tm: float = self.primer_tms.min - 10.0 | ||
object.__setattr__(self, "primer_max_self_any_thermo", default_thermo_tm) | ||
object.__setattr__(self, "primer_max_self_end_thermo", default_thermo_tm) | ||
object.__setattr__(self, "primer_max_hairpin_thermo", default_thermo_tm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just do this above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point in the code, you have validated other user inputs and you know that you have valid PrimerAndAmpliconParameters
. It made more sense to me to set these attributes after you know that you have valid input and can continue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @tfenne @emmcauley
I'm optimistic we can get this across the line in the next day or so.
@emmcauley I agree with Tim that this should be split up into a few PRs - one to execute the grand Primer
-> Oligo
renaming, and additional PR(s) to introduce the functional changes (e.g. ProbeParameters
, the various new parameters and attributes)
@tfenne If this becomes a purely refactoring PR, I think we can merge in without much additional review. I agree the points you've highlighted are worth addressing, but I think we'll be able to progress more quickly if we keep the scope of this PR focused on the renaming.
docs/overview.md
Outdated
@@ -3,8 +3,8 @@ | |||
The `prymer` Python library is intended to be used for three main purposes: | |||
|
|||
1. [Clustering targets](#clustering-targets) into larger amplicons prior to designing primers. | |||
2. [Designing primers](#designing-primers) (left or right) or primer pairs using Primer3 for each target from (1). | |||
3. [Build and Picking a set of primer pairs](#build-and-picking-primer-pairs) from the designed primer pairs produced in (2). | |||
2. [Designing oligos](#designing-primers) using Primer3 for each target from (1). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emmcauley Let's revert this line and address more substantial documentation updates in subsequent PRs.
prymer/api/oligo.py
Outdated
""" | ||
# Oligo Class and Methods | ||
|
||
This module contains a class and class methods to represent an oligo designed by Primer3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is a noble but perhaps lofty goal to expand the scope of this package beyond primer3
, certainly beyond the scope of this PR.
I agree we should be mindful of making decisions that are overly specific to primer3
. However I'd like to be cognizant that the content of this file is largely the same as the original primer.py
, and would like to limit this PR's scope to the planned refactoring (Primer
-> Oligo
)
prymer/api/oligo_like.py
Outdated
OligoLikeType = TypeVar("OligoLikeType", bound=OligoLike) | ||
"""Type variable for classes generic over `OligoLike` types.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed - @emmcauley could you please remove these two lines while you're in here?
prymer/api/primer.py
Outdated
@@ -14,14 +14,14 @@ | |||
```python | |||
>>> from prymer.api.span import Span, Strand | |||
>>> primer_span = Span(refname="chr1", start=1, end=20) | |||
>>> primer = Primer(tm=70.0, penalty=-123.0, span=primer_span) | |||
>>> primer = Oligo(tm=70.0, penalty=-123.0, span=primer_span) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed - @emmcauley could you please remove the duplicate docs?
I think the complete contents of the file should look something like this:
"""This module is deprecated - see prymer/api/oligo.py"""
import warnings
from dataclasses import dataclass
from prymer.api.oligo import Oligo
@dataclass(frozen=True, init=True, slots=True)
class Primer(Oligo):
"""
A deprecated alias for `Oligo`.
This class exists to maintain backwards compatibility with earlier releases of `prymer`, and may be removed in a future version.
"""
warnings.warn(
"The Primer class was deprecated, use Oligo instead",
DeprecationWarning,
stacklevel=2,
)
f"target length: {design_input.target.length}, " | ||
f"minimal probe size: {design_input.probe_params.probe_sizes.min}" | ||
) | ||
design_region = design_input.target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@emmcauley This line caught my eye due to the lack of test coverage.
This looks like a functional change - I'd suggest backing this out to be introduced by another PR (e.g. #50 where PickHybProbeOnly
is introduced).
2. [Designing](#designing-primers) primers (single and paired) and internal hybridization probes using Primer3 for each target from (1). | ||
3. [Build and Picking a set of primer pairs](#build-and-picking-primer-pairs) from the design candidates produced in (2). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we converged on design()
instead of design_oligos()
or design_primers()
bases: Optional[str] = None | ||
tail: Optional[str] = None | ||
|
||
def __post_init__(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of these methods derive from the original Primer
class and are unchanged, just ported from primer.py
to oligo.py
@@ -124,7 +123,3 @@ def _strand_to_location_string(self) -> str: | |||
case _: # pragma: no cover | |||
# Not calculating coverage on this line as it should be impossible to reach | |||
assert_never(f"Encountered unhandled Strand value: {self.span.strand}") | |||
|
|||
|
|||
PrimerLikeType = TypeVar("PrimerLikeType", bound=PrimerLike) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed per PR comment
@@ -279,17 +287,17 @@ def is_acceptable_primer_pair(primer_pair: PrimerPair, params: FilteringParams) | |||
|
|||
|
|||
def build_primer_pairs( | |||
lefts: Iterable[Primer], | |||
rights: Iterable[Primer], | |||
left_primers: Iterable[Oligo], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per Tim's request on OG PR, updating "lefts" and "rights" to "left_primers" and "right_primers"
|
||
``` | ||
""" | ||
"""This module is deprecated - see prymer/api/oligo.py""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updating deprecation of Primer
via Matt/Tim
@@ -37,7 +37,7 @@ class methods to represent a primer pair. The primer pair is comprised of a lef | |||
Span(refname='chr1', start=21, end=100, strand=<Strand.POSITIVE: '+'>) | |||
|
|||
>>> list(primer_pair) | |||
[Primer(name=None, tm=70.0, penalty=-123.0, span=Span(refname='chr1', start=1, end=20, strand=<Strand.POSITIVE: '+'>), bases='GGGGGGGGGGGGGGGGGGGG', tail=None), Primer(name=None, tm=70.0, penalty=-123.0, span=Span(refname='chr1', start=101, end=120, strand=<Strand.NEGATIVE: '-'>), bases='TTTTTTTTTTTTTTTTTTTT', tail=None)] | |||
[Oligo(name=None, tm=70.0, penalty=-123.0, span=Span(refname='chr1', start=1, end=20, strand=<Strand.POSITIVE: '+'>), bases='GGGGGGGGGGGGGGGGGGGG', tail=None), Oligo(name=None, tm=70.0, penalty=-123.0, span=Span(refname='chr1', start=101, end=120, strand=<Strand.NEGATIVE: '-'>), bases='TTTTTTTTTTTTTTTTTTTT', tail=None)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we stripped out the thermo-related attributes, I reverted the doc-test examples
failures: ordered list of Primer3Failures detailing design failure reasons and corresponding | ||
count | ||
""" | ||
|
||
filtered_designs: list[PrimerLikeType] | ||
designs: list[OligoLikeType] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
per Tim's request, updating filtered_designs
to designs
@@ -308,17 +309,10 @@ def get_design_sequences(self, region: Span) -> tuple[str, str]: | |||
hard_masked = "".join(soft_masked_list) | |||
return soft_masked, hard_masked | |||
|
|||
@staticmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method was renamed to a function has_acceptable_dinuc_run
and made into a module-level function
max_amplicon_length=design_input.params.max_amplicon_length, | ||
min_primer_length=design_input.params.min_primer_length, | ||
) | ||
design_region: Span |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept this in, @tfenne flagging this for you because I wasn't sure if we decided to take it out.
@@ -760,3 +744,44 @@ def _create_design_region( | |||
) | |||
|
|||
return design_region | |||
|
|||
|
|||
def _check_design_results(design_input: Primer3Input, design_results: dict[str, str]) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was existing code that I re-organized into a private function because I was trying to limit the sprawl of _build_oligos
return count | ||
|
||
|
||
def _has_acceptable_dinuc_run(design_input: Primer3Input, oligo_design: Oligo) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same -- existing code that I re-orged into a module-level function and renamed to be more descriptive
6f47a75
to
932c823
Compare
@@ -0,0 +1,181 @@ | |||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are new tests and the intention is to make sure that when one or both sets of Params are provided, that Primer3Input
is instantiated as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of comments, but please squash and merge once you've addressed or told me why you won't address!
primer_weights: Optional[PrimerAndAmpliconWeights] = None | ||
probe_weights: Optional[ProbeWeights] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Circling back to this with one more thought. Would it be hard to still have these defaulted on the input, but then depending on the task, only translate the appropriate weights to tags and send them to primer3?
…ates to imports and tests
a8951d5
to
c0e869e
Compare
So the task can be imported from `prymer.primer3` like the other tasks. @tfenne I should have thought of this while we were discussing #51, but do you have any thoughts on whether this task might be renamed for consistency with the other tasks and ease of discovery? e.g. `PickHybProbeOnlyTask`, or `DesignProbeTask`
The PR refactors the
Primer
object and plannedProbe
object into the same dataclass, now calledOligo
. AnOligo
object can represent single primer and probe designs. The corresponding base class, previously calledPrimerLike
, has been refactored toOligoLike
.The new
Oligo
object has the attributes that aPrimer
object had (tm
,penalty
,span
and optional attributesbases
,tail
) plus three new optional attributes:self_any_th
,self_end_th
, andhairpin_th
. These three attributes are emitted from Primer3 for probe design and, after an update toPrimerAndAmpliconParams
, will also be emitted for primer designs.For backwards compatibility, the
Primer
object now wraps anOligo
object and also emits aDeprecationWarning
.I left
PrimerPair
largely untouched -- aPrimerPair
is now composed of twoOligo
objects but the rest of the functionality ofPrimerPair
I left intact.