Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add concept of "default program" and make prog argument optional #364

Merged
merged 5 commits into from
Nov 30, 2023

Conversation

osandov
Copy link
Owner

@osandov osandov commented Nov 1, 2023

This adds the concept of the "default program", which will make it optional to pass prog to most functions and helpers. E.g.,

for task in for_each_task(): ...

Instead of

for task in for_each_task(prog): ...

This will be set up automatically in the CLI, and library users can opt into it if they want to.

The first commit introduces the framework, and the second commit has some examples of how to convert helpers to use it.

I really like how this improves the ergonomics of the CLI. The tradeoff is a little bit of extra code in helpers that take a program. (There's also the slight tradeoff of a bit more documentation for users to read, although I expect that most users can remain blissfully unaware of this concept.)

@brenns10, I'd love your opinion on this idea.

@brenns10
Copy link
Contributor

brenns10 commented Nov 2, 2023

I'm of two minds, but for the most part, it seems nice. I think having a default program makes a lot of sense, especially for the CLI. I personally would try to avoid my programs depending on it, but I think for many helpers to support it makes sense, and for small scripts and things I would probably use it too. It's a 👍🏻 from me!

The thing I'd want to improve upon, from this PR, is the ergonomics from the helper side of things. I wonder if we could have some sort of decorator function to handle it? The big issue is disambiguating a None meaning "no program, use the default" and a None which may be the next argument to the function.

It's already a bit weird to write helpers that can optionally support a program as the first argument, and this would add to that.

@osandov
Copy link
Owner Author

osandov commented Nov 2, 2023

Good call on the decorator. I'll see if I can put something together that handles the existing cases.

For future helpers, we might be able to sidestep this by making prog keyword-only (although the inconsistency might be weird.)

@brenns10
Copy link
Contributor

brenns10 commented Nov 2, 2023

I'm not sure I love the keyword only argument approach personally, for two reasons. One is the inconsistency, like you said. The other is that it would require code like for_each_foo(prog=prog) which is more verbose, and it kinda incentivizes people to rely on the default prog when writing code, because it's easier to type.

If you think about it, the default prog should only be used at the API boundary, where CLI users call into a helper. Helper code like this would be a bug:

def for_each_bar(prog: ProgramOrDefault):
    if not prog:
        prog = get_default_prog()
    bar = prog["bar_list"]
    for foo in for_each_foo():  # omit prog from helper function call
        # do something with foo and bar

Sorry for not coming up with a more meaningful example, but hopefully you see what I mean. Once you have a prog object, you should always pass it into every function which requires it, to avoid future uses of get_default_prog(). Otherwise, if prog_1 is the default, and the user makes a function call with prog_2, you could accidentally switch between them in one of your helper calls. Making the prog argument keyword only makes it a bit harder to type, and the lazier among us might make subtle bugs :)

@osandov
Copy link
Owner Author

osandov commented Nov 2, 2023

That's a fair point, although it is somewhat mitigated by the fact that I am (intentionally) not setting the default program for the helper tests, so anything that forgets to pass the program explicitly will blow up.

The tradeoff of keeping the positional argument convention will be awkward helper implementations, but let's see if I can come up with a couple of decorators to reduce that.

@brenns10
Copy link
Contributor

brenns10 commented Nov 2, 2023

Oh that's smart, that should handle all of the issues in the drgn library, and downstream applications can follow suit if they want.

I tried my hand at a suggestion for this. I basically thought there were two common patterns to handle:

  1. Helper functions that basically just take one argument, prog, and maybe some non-drgn arguments (if they were drgn arguments, you could infer prog from them). For these, I made a decorator. The caveat here is that none of the type annotations reflect that you can omit the first argument -- I figured that if you care enough to run mypy, you shouldn't be relying on the default prog anyway. It's mainly for interactive use.
  2. Helper functions which take an Object, or a Program and a sensible default (e.g. for_each_mount() in this PR). These really just need a helper function to handle their logic so that we don't keep repeating the same thing over and over. I added a helper and a type union ProgOrObject = Union[Program, Object, None].

I think the docstrings would need updating (drgndoc could detect the decorator and add a notice saying "prog can be omitted if you've setup the default program").

Just a suggested paradigm, but I'd love to see yours too!

diff --git a/drgn/helpers/common/prog.py b/drgn/helpers/common/prog.py
new file mode 100644
index 0000000..4460450
--- /dev/null
+++ b/drgn/helpers/common/prog.py
@@ -0,0 +1,47 @@
+# Copyright (c) 2023, Oracle and/or its affiliates.
+# SPDX-License-Identifier: LGPL-2.1-or-later
+
+from functools import wraps
+from typing import Any, Callable, Tuple, Union
+
+from drgn import Object, Program, get_default_prog
+
+ProgOrObject = Union[Program, Object, None]
+
+
+def prog_and_object(
+    prog_or_object: ProgOrObject, default: Callable[[Program], Object]
+) -> Tuple[Program, Object]:
+    if isinstance(prog_or_object, Object):
+        return prog_or_object.prog_, prog_or_object
+    elif isinstance(prog_or_object, Program):
+        prog = prog_or_object
+    elif prog_or_object is None:
+        prog = get_default_prog()
+    return prog, default(prog)
+
+
+def use_default_prog(func):  # type: ignore
+    """
+    Decorator for using the default program if it is not provided.
+
+    This decorator can be applied to helper functions whose first argument is a
+    Program. It allows callers to either provide this argument, or omit it for
+    use in the interactive debugging case. The decorator also updates the
+    docstring to indicate that the first parameter is optional.
+    """
+
+    # The docstring as modified here will not get parsed by Sphinx, because the
+    # drgndoc system works at the parser-level, not by importing the module.
+    # However, it is still worth adding a disclaimer for interactive users who
+    # may call help(func).
+    func.__doc__ += "NOTE: The first argument (prog) may be omitted to use the default."
+
+    @wraps(func)
+    def wrapped(*args: Any, **kwargs: Any) -> Any:
+        if len(args) >= 1 and isinstance(args[0], Program):
+            return func(*args, **kwargs)
+        else:
+            return func(get_default_prog(), *args, **kwargs)
+
+    return wrapped
diff --git a/drgn/helpers/linux/block.py b/drgn/helpers/linux/block.py
index 5ea81a2..bb1f37c 100644
--- a/drgn/helpers/linux/block.py
+++ b/drgn/helpers/linux/block.py
@@ -14,8 +14,9 @@ Before that, they were represented by ``struct hd_struct``.
 
 from typing import Iterator
 
-from drgn import Object, Program, ProgramOrDefault, container_of, get_default_prog
+from drgn import Object, Program, container_of
 from drgn.helpers.common.format import escape_ascii_string
+from drgn.helpers.common.prog import use_default_prog
 from drgn.helpers.linux.device import MAJOR, MINOR, MKDEV
 from drgn.helpers.linux.list import list_for_each_entry
 
@@ -91,7 +92,8 @@ def _for_each_block_device(prog: Program) -> Iterator[Object]:
         yield from list_for_each_entry("struct device", devices, "knode_class.n_node")
 
 
-def for_each_disk(prog: ProgramOrDefault = None) -> Iterator[Object]:
+@use_default_prog  # type: ignore
+def for_each_disk(prog: Program) -> Iterator[Object]:
     """
     Iterate over all disks in the system.
 
@@ -103,7 +105,7 @@ def for_each_disk(prog: ProgramOrDefault = None) -> Iterator[Object]:
     # block_device::bd_device. We start by assuming that the kernel has this
     # commit and fall back to the old path if that fails.
     have_bd_device = True
-    for device in _for_each_block_device(prog or get_default_prog()):
+    for device in _for_each_block_device(prog):
         if have_bd_device:
             try:
                 bdev = container_of(device, "struct block_device", "bd_device")
diff --git a/drgn/helpers/linux/fs.py b/drgn/helpers/linux/fs.py
index 2d489aa..fad74cc 100644
--- a/drgn/helpers/linux/fs.py
+++ b/drgn/helpers/linux/fs.py
@@ -12,17 +12,9 @@ Linux virtual filesystem (VFS) layer, including mounts, dentries, and inodes.
 import os
 from typing import Iterator, Optional, Tuple, Union, overload
 
-from drgn import (
-    IntegerLike,
-    Object,
-    Path,
-    Program,
-    ProgramOrDefault,
-    container_of,
-    get_default_prog,
-    sizeof,
-)
+from drgn import IntegerLike, Object, Path, Program, container_of, sizeof
 from drgn.helpers.common.format import escape_ascii_string
+from drgn.helpers.common.prog import ProgOrObject, prog_and_object
 from drgn.helpers.linux.list import (
     hlist_empty,
     hlist_for_each_entry,
@@ -276,7 +268,7 @@ def mount_fstype(mnt: Object) -> bytes:
 
 
 def for_each_mount(
-    prog_or_ns: Union[ProgramOrDefault, Object] = None,
+    prog_or_ns: ProgOrObject = None,
     src: Optional[Path] = None,
     dst: Optional[Path] = None,
     fstype: Optional[Union[str, bytes]] = None,
@@ -291,12 +283,7 @@ def for_each_mount(
     :param fstype: Only include mounts with this filesystem type.
     :return: Iterator of ``struct mount *`` objects.
     """
-    if prog_or_ns is None:
-        ns = get_default_prog()["init_task"].nsproxy.mnt_ns
-    elif isinstance(prog_or_ns, Program):
-        ns = prog_or_ns["init_task"].nsproxy.mnt_ns
-    else:
-        ns = prog_or_ns
+    prog, ns = prog_and_object(prog_or_ns, lambda p: p["init_task"].nsproxy.mnt_ns)
     if src is not None:
         src = os.fsencode(src)
     if dst is not None:

@brenns10 brenns10 mentioned this pull request Nov 3, 2023
@osandov osandov changed the title [RFC] Add concept of "default program" Add concept of "default program" and make prog argument optional Nov 30, 2023
@osandov
Copy link
Owner Author

osandov commented Nov 30, 2023

I'm pretty happy with this now. I pushed the decorator idea a bit further and was able to avoid the need for manual overloads thanks to #371. I'll probably merge this tomorrow.

Copy link
Contributor

@brenns10 brenns10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's happening! Wow :) I especially like your test cases for the decorators, and how much boilerplate code you managed to delete related to default arguments!

I think it makes sense that the documentation is mostly updated to omit the prog arguments. I worry that this will result in some confusion for developers who are writing scripts or applications that don't rely on the default, but the full function signatures are there too, so I think it's the right trade-off.

My only major suggestion is, can we update the contrib/ptdrgn.py accordingly?

diff --git a/contrib/ptdrgn.py b/contrib/ptdrgn.py
index 1e31f74..fdd39a4 100644
--- a/contrib/ptdrgn.py
+++ b/contrib/ptdrgn.py
@@ -159,6 +159,7 @@ def run_interactive(
         "offsetof",
         "reinterpret",
         "sizeof",
+        "stack_trace",
     ]
     for attr in drgn_globals:
         init_globals[attr] = getattr(drgn, attr)
@@ -183,6 +184,11 @@ For help, type help(drgn).
     if globals_func:
         init_globals = globals_func(init_globals)
 
+    try:
+        old_default_prog = drgn.get_default_prog()
+    except drgn.NoDefaultProgramError:
+            old_default_prog = None
+
     old_path = list(sys.path)
     # The ptpython history file format is different from a standard readline
     # history file since it must handle multi-line input, and it includes some
@@ -191,6 +197,7 @@ For help, type help(drgn).
     histfile = os.path.expanduser("~/.drgn_history.ptpython")
     try:
         sys.path.insert(0, "")
+        drgn.set_default_prog(prog)
 
         print(banner)
         embed(
@@ -200,6 +207,7 @@ For help, type help(drgn).
             configure=configure,
         )
     finally:
+        drgn.set_default_prog(old_default_prog)
         sys.path[:] = old_path
 

if wrapper.__doc__ is None:
wrapper.__doc__ = ""
wrapper.__doc__ += (
":param prog: Program, which may be omitted to use the default program."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a trailing newline added too? It seems like you're already counting on there being a trailing newline on __doc__ if it exists, so that you don't append to the last line.

Comment on lines 67 to 70
def takes_program_or_default(f: "TakesProgram[P, R]") -> "TakesProgramOrDefault[P, R]":
"""
Wrap a function taking a :class:`~drgn.Program` so that it uses the
:ref:`default-program` if omitted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this docstring, it primed me to believe that this function would never use Object.prog_. Due to the language "it uses the default program if omitted" - in my reading, that means that there are two cases: either a program is provided by the user, or the default program is used.

In the implementation (and one example you give), that expectation is not correct - there's the third case: the user provides an object, in which case you use Object.prog_. I think it would be nice to have a second sentence in here that says something along the lines of If the first argument is a :class:~drgn.Object, then its associated program will be used rather than the default program.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm overloading the term default program in a few places to either mean "the program set by set_default_prog()" or "the program used if one is not explicitly given". I'll try to clarify that. Maybe the latter could be referred to as the "default program rules", or I can say the "program is automatically determined if omitted", linking to the rules.

@osandov
Copy link
Owner Author

osandov commented Nov 30, 2023

My only major suggestion is, can we update the contrib/ptdrgn.py accordingly?

Good catch. I can fold your patch in unless you want to push a commit somewhere for me to add to this PR.

@brenns10
Copy link
Contributor

Good catch. I can fold your patch in unless you want to push a commit somewhere for me to add to this PR.

I already dropped the unstaged changes from my tree 😆
Probably easiest for you to fold it in, if that's alright with you. It was a rote translation of your changes in drgn.cli.run_interactive() anyway, no real authorship on my part!

Some feedback that I've gotten that resonated with me was that it feels
silly and too verbose to always pass prog to helpers, like
for_each_task(prog), find_task(prog, pid), etc. Passing prog makes sense
from the library point of view, but it's not great for interactive
usability. And even when you include usage as a library, the vast
majority of use cases only need one program.

So, let's introduce the concept of the "default program". It has a
getter (get_default_prog()) and a setter (set_default_prog()). The CLI
sets it automatically. Library users can do it manually if they want to.
It is a per-thread setting.

Upcoming commits will update all of our helpers and functions that take
a Program to make it optional and default to the default program.

P.S. This was inspired by asyncio, which has many interfaces that take
an optional loop parameter but default to the current loop. Cf.
asyncio.get_event_loop() and asyncio.set_event_loop().

Signed-off-by: Omar Sandoval <[email protected]>
To simplify converting helpers to allow omitting the prog argument, add
a couple of decorators that take care all of the messy details.

@takes_program_or_default is based on an example from Stephen Brennan,
and the rest handle progressively more complex calling conventions.

Also update drgndoc to be aware of these decorators and add some
boilerplate to the generated documentation.

Signed-off-by: Omar Sandoval <[email protected]>
For the most part, this simply entails adding the correct decorator.
There are some notable conversions:

- All of the memory management helpers that already had
  (prog: Program, addr: IntegerLike) and (addr: Object) overloads are
  now much simpler and support keyword arguments.
- Helpers that already took a Program or an Object are also now much
  simpler and support keyword arguments.
- Helpers that previously took a Program and positional parameters with
  a default value (path_lookup(), for_each_mount(), print_mounts()) had
  those parameters converted to keyword-only. This is not
  backwards-compatible, unfortunately.

Signed-off-by: Omar Sandoval <[email protected]>
A few people have told me that they frequently forget whether
stack_trace() is a Program method or a function. Now that we have a
default program, it can be both.

Signed-off-by: Omar Sandoval <[email protected]>
Using the default program is nicer to read and type, so prefer it for
documentation.

Signed-off-by: Omar Sandoval <[email protected]>
Copy link
Contributor

@brenns10 brenns10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that you're necessarily waiting on my review, but looks good to me with the rewording!

@osandov osandov merged commit d3edc0a into main Nov 30, 2023
38 checks passed
@osandov osandov deleted the default-prog branch November 30, 2023 22:44
@osandov
Copy link
Owner Author

osandov commented Nov 30, 2023

Oof, I should've added test-all-python-versions to this one. Will fix...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants