Skip to content

Latest commit

 

History

History
executable file
·
425 lines (351 loc) · 19.8 KB

shebang.org

File metadata and controls

executable file
·
425 lines (351 loc) · 19.8 KB

Executable Org files

This block acts as a shebang that makes Org files executable on bash, bash --posix, zsh, dash, pwsh, and powershell. It may also work on other posix shells, but they have not been tested. See the support matrix for the most up-to-date accounting. For pwsh and powershell you must symlink to have a .ps1 file extension.

The shebang block must come before the first line of the file that does not start with #.

In this file the block is visible to illustrate how it works. It uses #+header: :exports code so that it is easy to copy and paste the block into other files with :exports none set so that it is invisible for export.

For more see the EmacsConf 2021 talk on executable org files.

Support matrix

shellstatus${shell} shebang.org./shebang.orgruns via
bashsupportedyesyesbash
bash --posixsupportedyesyesbash --posix
zshsupportedyesyessh
dashsupportedyes >= 0.5.12 [fn:dashorg]yessh
pwshsupporteduse pwsh shebang.ps1 [fn:pwshorg]use ./shebang.ps1pwsh
powershellsupporteduse powershell shebang.ps1use ./shebang.ps1powershell
tcshworkaroundnoish [fn:cshorg]sh
fishpartialnoyessh

[fn:pwshorg] In principle pwsh shebang.org could work, however there is no way to get the file name to pass to emacs without digging around in /proc/$PID/cmdline and that will only work on linux. This is more or less what you need, the problem is the parens in the call to split. Also this is almost completely irrelevant because none of this is possible on windows powershell and there is pretty much zero chance that anyone would be running pwsh as their user shell on linux and not be entirely ok with symlinking to .ps1.

$file = (Get-Content /proc/${PID}/cmdline).Split([char]0x00)[1]

[fn:dashorg] If you are on a system that has dash as sh there is a bug in versions <dash-0.5.12 where set -e is not honored when a redirect failed to be created from a non-forking command grouping.

set -e; { echo -run;} > ""; echo +print bug $?

Fix is f42ee97f9e6fa15b7b6d85bb2faace4cadc1613e on the dash repo. It looks like debian already has a patch to backport the fix.

The shebang block will work on versions <=dash-0.5.12 but beware that if the variable $null is set then set -e will not be honored and variables can leak in from the environment. See the explication for more details.

[fn:cshorg] This block is know not to work on tcsh and csh due to the use of "${@}". However, there is a workaround which is to put a single space at the start of the file before # -*- mode: org -*- on the first line. The space causes tcsh to run the file via sh.

Windows

On windows org files must be symlinked to have a .ps1 file extension. You can use the following powershell function to create the symlinks. Beware that certain posix tools for windows such as git-bash have a version of ln that doesn’t actually create symlinks.

function make-link ($target, $link) { New-Item -Path $link -ItemType SymbolicLink -Value $target }

You can then run the following.

make-link shebang.org shebang.ps1

If you want to use the file as a command create an alias.

New-Alias test-orgstrap-shebang (Get-ChildItem shebang.ps1).FullName -Force

There is a rare bug that can happen if you try to run a file with dos line endings on a system that expects posix line endings. If you do this you will encounter errors if there are any “blank” lines before the shebang block because bash and friends will try to run a command named the carriage return variable \r. To fix this either remove the blank lines or add a # at the start of the line.

Details

Shell

The following is an explication of the shell lines that make the shebang block portable.

sh exit on error

In powershell the is effectively a noop.

set -e "-C" "-e" "-e"

This is a real doozy. We want to run set -Ce for posix sh so that if something is assigned to the variable $null we don’t overwrite a file with the same name, and we immediately want to exit if null is set in that case. We also want to exit if null is set to a path that cannot be created .e.g null=does/not/exist. Doing this is sufficient to protect the rest of the block from further shell insanity.

In powershell this calls the Set-Variable commandlet and it assigns a value of -e to the variable ${-e}. Ideally we wouldn’t do anything at all here but Set-Value requires that -e (exclude) receive an argument, and then it requires that a variable name and a variable value also be passed, otherwise it will grab stdin.

We exploit the fact that set accepts repetition of the same option multiple times, so we pass "-e" twice more so that we don’t overwrite any elements of the argv.

sh set value of variable $null to point to /dev/null

In powershell this is effectively a noop.

{ null=/dev/null;} > "${null:=/dev/null}"

In sh this line ensures that if for some insane reason the variable $null has been set that we override it to point to /dev/null.

Under normal circumstances when $null is unset, this will merely assign /dev/null to the variable $null twice.

Under the unlikely circumstance that the variable $null is set to some value an empty file bearing the name that is the value stored in the variable $null will be created and that file will remain empty.

If for some crazy reason the value in $null is a name that cannot be created, e.g. because it points to a non-existent directory (e.g. export null=does/not/exist), then set -e will exit and no further commands will run due to a failure to create the redirect file.

In powershell this creates a script block and redirects it without evaluating it to the variable named null:=/dev/null which is assumed to be undefined on powershell. It does NOT redirect to the variable $null which IS always bound in powershell and is what we are trying to replicate in sh with this line.

sh make sure that variables are empty and that = is on the path

In powershell this is effectively a noop.

{ args=;file=;MyInvocation=;__p=$(mktemp -d);touch ${__p}/=;chmod +x ${__p}/=;__op=$PATH;PATH=${__p}:$PATH;} > "${null}"

We set variables to the empty string so that there is no chance that an existing value might sneak through from the environment.

Both $args and "${@}" are passed to emacs and they should always xor because powerhsell uses $args and sh and friends use $@. This ensures $args is null if for whatever reason it was set.

Annoyingly we have to use mktemp -d in order to add = to the path because not only does dash not support the function keyword, but it also arbitrarily prevents defining a function with the name =. As a result the only portable way to get = on path is to create an executable file for it.

mktemp has not been standardized as part of posix. However, I have tested the default behavior of mktemp -d for the variants provided by gnu, busybox, macos, and FreeBSD and they all produce paths with no spaces. This means that the use of ${__p} without quotes should be safe. See https://unix.stackexchange.com/q/614808 for more.

In powerhsell the curly braces demarcate a script block which defers evaluation. This means that as long as you don’t put anything too syntactically evil inside, powershell won’t do anything except try to print it stdout, which we squash by dumping to $null.

{
    args=;
    file=;
    MyInvocation=;
    __p=$(mktemp -d);
    touch ${__p}/=;
    chmod +x ${__p}/=;
    __op=$PATH;
    PATH=${__p}:$PATH;
} > "${null}"

powershell assign $file

In sh this line is effectively a noop.

$file = $MyInvocation.MyCommand.Source

We assign both powerhsell and sh equivalents to the same variable to simplify passing it to emacs later in the block.

When = is on path as an empty file calling = returns 0 and since $file is null this line is equivalent to running /bin/true $MyInvocation.MyCommand.Source which prevents the presence of the periods on the line from causing errors.

The spaces before and after = are valid for assignments in powerhsell important for this line to be a noop in sh.

sh assign $file, remove = from PATH, and clean up after mktemp

{ file=$0;PATH=$__op;rm ${__p}/=;rmdir ${__p};} > "${null}"
{
    file=$0;
    PATH=$__op;
    rm ${__p}/=;
    rmdir ${__p};
} > "${null}"

Invoke emacs

emacs -batch -no-site-file -eval "(org-shebang)" "${file}" -- ${args} "${@}"

The exact use of $args or ${args} and "${@}" is critical for emacs to receive the correct values in argv.

${args} is used instead of $args in the event that in sh someone somehow has a, ar, or arg bound as a variable.

Critically ${args} must NOT be quoted, otherwise powershell will pass a single string rather than an array.

Critically "${@}" must BE quoted, otherwise sh will split args with spaces and pass them as individual arguments to emacs.

Note that $@ MUST NOT BE ASSIGNED TO ANOTHER VARIABLE. The behavior of assigning $@ to another variable is unspecified. See https://unix.stackexchange.com/a/532163 and <https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/ utilities/V3_chap02.html#tag_18_05_02>

Note that "(org-shebang)" is an imagined future builtin implementation of the elisp that is explicated below.

bash shebang.org --test "w s" 1 2>&1
dash shebang.org --test "w s" 1 2>&1
zsh  shebang.org --test "w s" 1 2>&1
sh   shebang.org --test "w s" 1 2>&1
pwsh shebang.ps1 --test "w s" 1 2>&1

Exit after we finish running the file in emacs

exit

Keep powershell syntax checking happy

In sh this line never runs and is never parsed.

<# powershell open

powershell parses the entire contents of a .ps1 file to ensure that it is well formed before running any individual command.

In sh we don’t have to worry about this because the semantics of sh are to operate line by line, so in principle we can put anything we want after the call to exit and sh won’t ever care.

Emacs Lisp

A breakdown of the elisp that appears in the -eval string.

(let (vc-follow-symlinks) ; don't follow symlinks as there is no way
  ;; to prevent them from opening in `org-mode' due to an oversight
  ;; in `vc-follow-link' if for whatever reason you need to work
  ;; from the truename of the file then the adjustment can be made
  ;; in the orgstrap block itself

  (defun orgstrap--confirm-eval (l _) (not (memq (intern l) '(elisp emacs-lisp))))
  ;; allow elisp blocks to run without prompting, this bypasses the
  ;; usual orgstrap safeguards but when running as a script there are
  ;; other mechanisms that preven automatic execution we use `intern'
  ;; here to avoid having to escape strings which breaks powershell

  (let ((file (pop argv)) ; file is passed on argv to avoid needing to
        ;; escape double quotes with the nice side effect that it can
        ;; handle file names with a literal double quote

        ;; explicitly set to nil to work around the dos literal local
        ;; variables issue, it seems the `find-file-literally' still
        ;; reads local variables even if it does not set them
        enable-local-variables)
    (find-file-literally file)
    ;; `find-file-literally' avoids org-mode which can take over 500ms
    (end-of-line)
    (when (eq (char-before) ?\^m) ; if crlf line ending detected
      ;; revert buffer to avoid dos literal local variables issue
      (let ((coding-system-for-read 'utf-8))
        (revert-buffer nil t t))))

  (let ((enable-local-eval t) ; when running as a script there
        ;; are other means of preventing arbitrary execution
        (enable-local-variables :all) ; allow all local variables

        ;; fake setting the major mode to org-mode so that
        ;; `org-set-regexp-and-options' will trigger correctly
        (major-mode 'org-mode)

        ;; `find-file-literally' is sticky and will cause issue if
        ;; e.g. a call to `find-file' prompting to revisit normally
        find-file-literally)
    ;; we do not set `enable-local-eval' and `enable-local-variables'
    ;; in the outer let due to a bug in Emacs 26 and 27 inside of
    ;; `find-file-noselect' where `enable-local-eval' is not properly
    ;; shadowed see Emacs a1fd11a28f3c2f4f81163765dd3b53e5ce0b39cf
    (require 'org) ; `org-set-regexps-and-options' is not an autoload
    ;; so we have to require org to get it
    (org-set-regexps-and-options) ; `org-complex-heading-regexp' must be bound
    ;; otherwise orgstrap blocks that use noweb will cause errors

    ;; run the orgstrap block without entering org-mode, this saves lots of time
    (hack-local-variables)))

Old approach

This is an older version of the block that is more verbose and that also does not work on Debian and friends because dash does not support the function keyword. The development workflow is also significantly more annoying and prone to break because the checksums always have to be updated and kept in sync.

function = () { :; }
function silentlycontinue () { :; }

$ErrorActionPreference= "silentlycontinue"
null="/dev/stdout"
__FILE="${0}"
__PARGS=${@}
function posix () {
    test $ZSH_VERSION && { setopt shwordsplit; _IFS=$IFS; unset IFS; }
    emacs --quick --batch --load ~/.emacs.d/orgstrap-developer-checksums.el --load ~/.emacs.d/orgstrap-batch-helper.el --visit "${__FILE}" -- ${__PARGS}
    test $ZSH_VERSION && { unsetopt shwordsplit; IFS=$_IFS; }
}
"posix" > $null
"exit" > $null
$ErrorActionPreference= "Continue"

$org=$MyInvocation.MyCommand.Source
emacs --quick --batch --load ~/.emacs.d/orgstrap-batch-helper.el --visit $org -- $args
exit
<# open powershell comment

Issues

cannot use a mode: org; local variable, it triggers hack-local-variables twice somehow

setting an explicit mode mode: org; in the file causes hack local variables to try to run itself twice recursively.

emacs -q -Q -eval "(let ((file (pop argv))) (find-file-literally file) (hack-local-variables))" "./shebang.org"

Bootstrap

(message "noweb working")
(message "I am an executable Org file!") ; (ref:test)
(message "file name is: %S" buffer-file-name)
(message "file truename is: %S" buffer-file-truename)
(message "argv is: %S" argv)
<<nowhere>>
(unless (featurep 'ow) (load (expand-file-name "ow.el" default-directory)))
(ow-cli-gen
    ((:test))
  (message "running ow-cli-gen block ..."))
(message "post cli-gen")

(test) Make sure coderefs work.

Local Variables