-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Small changes to simplify build_primer_pairs(). #89
Conversation
WalkthroughThe pull request modifies the In Possibly related PRs
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
prymer/api/picking.py
(3 hunks)prymer/api/primer_pair.py
(2 hunks)tests/api/test_primer_pair.py
(3 hunks)
🔇 Additional comments (6)
prymer/api/picking.py (2)
144-145
: Good refactor.
Moving span calculation to PrimerPair class improves modularity.
162-162
: Clean parameter passing.
Direct use of amp_span reduces unnecessary variables.
tests/api/test_primer_pair.py (2)
Line range hint 430-439
: LGTM: Clearer error message for reference mismatch
More concise error message that maintains the same test coverage.
453-453
: LGTM: Improved error clarity for primer order
More descriptive error message that better explains the validation failure.
prymer/api/primer_pair.py (2)
87-91
: Assignment of _amplicon
using calculate_amplicon_span
is clear and correct
233-263
: Static method calculate_amplicon_span
is well-defined and correctly implemented
object.__setattr__( | ||
self, | ||
"_amplicon", | ||
PrimerPair.calculate_amplicon_span(self.left_primer, self.right_primer), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion Is this a good opportunity to refactor to make amplicon
a cached_property
, rather than mucking about with a private field and setattr
?
e.g.
@cached_property
def amplicon(self) -> Span:
"""The interval spanned by the pair's amplicon."""
return self.calculate_amplicon_span(self.left_primer, self.right_primer)
and then no need for the post-init
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's questionable. For better or worse, PrimerPair relies on the checks in calculate_amplicon_span()
to enforce that the pairing makes sense ... so we could make it a cached property and the still have a post_init that accesses it, or replicate the checks ... or separate the checks into yet another function. None of which seem obviously better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really like to remove _amplicon
as a "private" field.
As a consequence of the current implementation, both asdict()
and fields()
return a field that isn't accepted by the class constructor. This is unusual behavior and requires the user to remember and manually remove the field.
Could we make amplicon
a cached_property
, and update the post-init checks to reference that property? That protects all the cases I'm aware of motivating the current implementation:
- Mutation-free access to an
amplicon
"attribute". - The value of
amplicon
is still derived from the input primers.
As a bonus, it removes the need to use setattr
(currently necessary because we're trying to mutate a frozen dataclass after instantiation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried ... but failed. Changed it to a cached property and tried calling it from post_init() to run the validation. Now I get this in tests, which is super unhelpful and I'm not sure why:
ERROR tests/api/test_primer_pair.py - TypeError: No '__dict__' attribute on 'PrimerPair' instance to cache 'amplicon' property.
ERROR tests/api/test_primer_pair.py - TypeError: No '__dict__' attribute on 'PrimerPair' instance to cache 'amplicon' property.```
Going to merge without making the change, and we can circle back to it separately.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #89 +/- ##
=======================================
Coverage 96.66% 96.66%
=======================================
Files 26 26
Lines 1707 1710 +3
Branches 331 333 +2
=======================================
+ Hits 1650 1653 +3
Misses 31 31
Partials 26 26 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
prymer/api/primer_pair.py (2)
267-267
: Fix copy-pasted comment.Comment is incorrect - this check verifies the right primer ends after left primer ends.
242-244
: Clarify overlap rules in docstring.Specify allowed overlap percentage between primers.
- a Span starting at the first base of the left primer and ending at the last base of - the right primer + A Span starting at the first base of the left primer and ending at the last base of + the right primer. Primers may overlap by up to 50% of their length.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
prymer/api/primer_pair.py
(2 hunks)tests/api/test_primer_pair.py
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/api/test_primer_pair.py
# Require that the left primer starts before the right primer | ||
if left_primer.span.start > right_primer.span.start: | ||
raise ValueError( | ||
"The reference must be the same across primers in a pair; received " | ||
f"left primer ref: {self.left_primer.span.refname}, " | ||
f"right primer ref: {self.right_primer.span.refname}" | ||
"Left primer does not start before the right primer. " | ||
f"Left primer span: {left_primer.span}, " | ||
f"Right primer span: {right_primer.span}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add validation for primer overlap.
Current check only verifies start positions. Also validate that left primer doesn't overlap significantly with right primer.
Add this check after the existing validation:
if left_primer.span.start > right_primer.span.start:
raise ValueError(
"Left primer does not start before the right primer. "
f"Left primer span: {left_primer.span}, "
f"Right primer span: {right_primer.span}"
)
+
+# Ensure primers don't have significant overlap
+overlap = min(left_primer.span.end, right_primer.span.end) - max(left_primer.span.start, right_primer.span.start)
+if overlap > len(left_primer.bases) / 2: # Allow up to 50% overlap
+ raise ValueError(
+ f"Primers overlap too much ({overlap} bases). "
+ f"Left primer span: {left_primer.span}, "
+ f"Right primer span: {right_primer.span}"
+ )
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Require that the left primer starts before the right primer | |
if left_primer.span.start > right_primer.span.start: | |
raise ValueError( | |
"The reference must be the same across primers in a pair; received " | |
f"left primer ref: {self.left_primer.span.refname}, " | |
f"right primer ref: {self.right_primer.span.refname}" | |
"Left primer does not start before the right primer. " | |
f"Left primer span: {left_primer.span}, " | |
f"Right primer span: {right_primer.span}" | |
# Require that the left primer starts before the right primer | |
if left_primer.span.start > right_primer.span.start: | |
raise ValueError( | |
"Left primer does not start before the right primer. " | |
f"Left primer span: {left_primer.span}, " | |
f"Right primer span: {right_primer.span}" | |
) | |
# Ensure primers don't have significant overlap | |
overlap = min(left_primer.span.end, right_primer.span.end) - max(left_primer.span.start, right_primer.span.start) | |
if overlap > len(left_primer.bases) / 2: # Allow up to 50% overlap | |
raise ValueError( | |
f"Primers overlap too much ({overlap} bases). " | |
f"Left primer span: {left_primer.span}, " | |
f"Right primer span: {right_primer.span}" | |
) |
def test_calculate_amplicon_span() -> None: | ||
left = Oligo(name="l", bases="AACCGGTTAA", tm=60, penalty=1, span=Span("chr1", 50, 59)) | ||
right = Oligo(name="l", bases="AACCGGTTAA", tm=60, penalty=1, span=Span("chr1", 150, 159)) | ||
assert PrimerPair.calculate_amplicon_span(left, right) == Span("chr1", 50, 159) | ||
|
||
left = Oligo(name="l", bases="AACCGGTTAA", tm=60, penalty=1, span=Span("chr2", 50, 59)) | ||
right = Oligo(name="l", bases="AACCGGTTAA", tm=60, penalty=1, span=Span("chr3", 150, 159)) | ||
with pytest.raises(ValueError, match="different references"): | ||
PrimerPair.calculate_amplicon_span(left, right) | ||
|
||
left = Oligo(name="l", bases="AACCGGTTAA", tm=60, penalty=1, span=Span("chr1", 150, 159)) | ||
right = Oligo(name="l", bases="AACCGGTTAA", tm=60, penalty=1, span=Span("chr1", 50, 59)) | ||
with pytest.raises(ValueError, match="Left primer does not start before the right primer"): | ||
PrimerPair.calculate_amplicon_span(left, right) | ||
|
||
left = Oligo(name="l", bases="AACCGGTTAAACGTT", tm=60, penalty=1, span=Span("chr1", 150, 164)) | ||
right = Oligo(name="l", bases="AACCGGTTAA", tm=60, penalty=1, span=Span("chr1", 150, 159)) | ||
with pytest.raises(ValueError, match="Right primer ends before left primer ends"): | ||
PrimerPair.calculate_amplicon_span(left, right) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note I usually try to use pytest.mark.parametrize
to parallelize test execution as much as possible.
Tests exit after the first failed assertion, so if you bundle test cases in a single test, you run the risk of obscuring later failures. If you parametrize, you hit as many failures as possible in a single test run, and don't find yourself stuck in a loop of addressing one test case, running the test suite, and discovering a previously hidden failure.
I usually have one test parametrized over "good" cases, with each test case paired with its expected output, and one test parametrized over "bad" cases, where each case is expected to raise an exception.
Since here the span
is the primary variable being manipulated by each case, I'd write something like the following:
@pytest.mark.parametrize(
"left_primer,right_primer,expected_amplicon",
[
(Span("chr1", 50, 59), Span("chr1", 150, 159), Span("chr1", 50, 159)),
]
)
def test_calculate_amplicon_span(left_primer: Span, right_primer: Span, expected_amplicon: Span) -> None:
"""The amplicon should span from the start of the left primer to the end of the right primer."""
# TODO add logic to build an `Oligo` from the test `Span`
actual_amplicon: Span = calculate_amplicon_span(left_primer, right_primer)
assert actual_amplicon == expected_amplicon
@pytest.mark.parametrize(
"left_primer,right_primer,error_msg",
[
(Span("chr2", 50, 59), Span("chr3", 150, 159), "different references"),
],
)
def test_calculate_amplicon_span_raises(left_primer: Span, right_primer: Span, error_msg: str) -> None:
"""An error should be raised if the spans are on different references, or if the left primer is not to the left of the right primer."""
with pytest.raises(ValueError, match=error_msg):
calculate_amplicon_span(left_primer, right_primer)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is, to me, a stylistic thing, and I come down very strongly on the other side. I almost never use pytest.mark.parametrize
because I personally find that it makes tests less maintainable, harder to read etc.
If the tests aren't as simple as these, then I tend to break them up into multiple test functions.
See this discussion for context: #89 (comment) This PR also updates the mypy config to exclude the `mkdocs` directories.
No description provided.