AArch64: Enhance the effectiveness of the register cache
365224b introduced a cache to keep track of BEAM registers
that had already been loaded into CPU registers.

This commit further improves the effectiveness of the cache. When
loading all modules in OTP and the standard library of Elixir 1.16.0,
21,109 loads of BEAM registers are now avoided, compared to 15,098
loads before this commit.

Of those load instructions, 8,975 are no longer needed at all
because the contents of the BEAM register are already present in the
desired CPU register. Before this commit, only 1,861 load instructions
were eliminated.

The remaining load instructions are replaced with a `mov`
instruction (from register to register), which is more efficient than
a load instruction but does not reduce the code size.

As an example, the following BEAM code:

    {test,is_nonempty_list,{f,3},[{y,0}]}.
    {get_hd,{y,0},{x,0}}.

would be translated to native code like so:

    # is_nonempty_list_fS
        ldr x8, [x20]
        tbnz x8, 1, @label_3-1
    # get_hd_Sd
        ldr x8, [x20]
        ldur x25, [x8, -1]

That is, `{y,0}` would be loaded into a CPU register twice.

The improved caching avoids reloading `{y,0}` in the `get_hd`
instruction:

    # is_nonempty_list_fS
        ldr x8, [x20]
        tbnz x8, 1, @label_3-1
    # get_hd_Sd
    # skipped fetching of BEAM register
        ldur x25, [x8, -1]
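
The three outcomes described in this message (load skipped, load replaced with `mov`, load still emitted) can be illustrated with a small model. This is only a sketch under stated assumptions, not the code in `beam_asm.hpp`: the names `RegisterCache`, `Outcome`, `load`, `clobber`, and `replay_example` are hypothetical, and registers are represented as plain strings rather than assembler operands.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model of a register cache. This is NOT the
// code in beam_asm.hpp; all names here are made up for illustration.
enum class Outcome { Skipped, Moved, Loaded };

class RegisterCache {
    // CPU register name -> BEAM register whose value it currently holds.
    std::map<std::string, std::string> holds;

public:
    std::vector<std::string> code; // emitted assembly, one line per entry

    // Make the value of `beam_reg` (stored at memory operand `mem`)
    // available in the CPU register `wanted`.
    Outcome load(const std::string &beam_reg,
                 const std::string &mem,
                 const std::string &wanted) {
        auto it = holds.find(wanted);
        if (it != holds.end() && it->second == beam_reg) {
            // Already in the desired register: no instruction needed.
            code.push_back("# skipped fetching of BEAM register");
            return Outcome::Skipped;
        }

        // Look for another CPU register that holds the value.
        std::string source;
        for (const auto &entry : holds) {
            if (entry.second == beam_reg) {
                source = entry.first;
                break;
            }
        }

        Outcome outcome;
        if (!source.empty()) {
            // Cached elsewhere: a register-to-register move replaces the load.
            code.push_back("mov " + wanted + ", " + source);
            outcome = Outcome::Moved;
        } else {
            // Not cached: the load from memory is still required.
            code.push_back("ldr " + wanted + ", " + mem);
            outcome = Outcome::Loaded;
        }
        holds[wanted] = beam_reg; // `wanted` now caches this BEAM register
        return outcome;
    }

    // Forget what `cpu_reg` held; call this when an instruction overwrites it.
    void clobber(const std::string &cpu_reg) { holds.erase(cpu_reg); }
};

// Replays the is_nonempty_list / get_hd example from this commit message.
std::vector<std::string> replay_example() {
    RegisterCache cache;
    cache.load("{y,0}", "[x20]", "x8");             // is_nonempty_list
    cache.code.push_back("tbnz x8, 1, @label_3-1");
    cache.load("{y,0}", "[x20]", "x8");             // get_hd
    cache.code.push_back("ldur x25, [x8, -1]");
    cache.clobber("x25");                           // x25 was overwritten
    return cache.code;
}
```

Calling `replay_example()` yields the four lines of the improved listing: the second request for `{y,0}` finds it in `x8` and emits only the comment. A real cache would also have to invalidate entries when a BEAM register is written or when control flow can arrive from elsewhere; the sketch leaves that out.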
bjorng committed Jan 22, 2024
1 parent 9b61354 commit 2bd6230
Showing 8 changed files with 342 additions and 183 deletions.
309 changes: 215 additions & 94 deletions erts/emulator/beam/jit/arm/beam_asm.hpp

Large diffs are not rendered by default.

10 changes: 5 additions & 5 deletions erts/emulator/beam/jit/arm/instr_arith.cpp
@@ -152,7 +152,7 @@ void BeamModuleAssembler::emit_i_plus(const ArgLabel &Fail,
     if (always_small(LHS) && always_small(RHS) && is_small_result) {
         auto dst = init_destination(Dst, ARG1);
         if (rhs_is_arm_literal) {
-            auto lhs = load_source(LHS, ARG2);
+            auto lhs = load_source(LHS);
             Uint cleared_tag = RHS.as<ArgSmall>().get() & ~_TAG_IMMED1_MASK;
             comment("add small constant without overflow check");
             a.add(dst.reg, lhs.reg, imm(cleared_tag));
@@ -341,7 +341,7 @@ void BeamModuleAssembler::emit_i_minus(const ArgLabel &Fail,
     if (always_small(LHS) && always_small(RHS) && is_small_result) {
         auto dst = init_destination(Dst, ARG1);
         if (rhs_is_arm_literal) {
-            auto lhs = load_source(LHS, ARG2);
+            auto lhs = load_source(LHS);
             Uint cleared_tag = RHS.as<ArgSmall>().get() & ~_TAG_IMMED1_MASK;
             comment("subtract small constant without overflow check");
             a.sub(dst.reg, lhs.reg, imm(cleared_tag));
@@ -1175,7 +1175,7 @@ void BeamModuleAssembler::emit_i_band(const ArgLabel &Fail,
                                              &ignore)) {
             comment("skipped test for small operands since they are always "
                     "small");
-            auto lhs = load_source(LHS, ARG2);
+            auto lhs = load_source(LHS);
             auto dst = init_destination(Dst, ARG1);

             /* TAG & TAG = TAG, so we don't need to tag it again. */
@@ -1269,7 +1269,7 @@ void BeamModuleAssembler::emit_i_bor(const ArgLabel &Fail,
         if (a64::Utils::encodeLogicalImm(rhs, 64, &ignore)) {
             comment("skipped test for small operands since they are always "
                     "small");
-            auto lhs = load_source(LHS, ARG2);
+            auto lhs = load_source(LHS);
             auto dst = init_destination(Dst, ARG1);

             a.orr(dst.reg, lhs.reg, rhs);
@@ -1624,7 +1624,7 @@ void BeamModuleAssembler::emit_i_bsl(const ArgLabel &Fail,
     if (is_bsl_small(LHS, RHS)) {
         comment("skipped tests because operands and result are always small");
         if (RHS.isSmall()) {
-            auto lhs = load_source(LHS, ARG2);
+            auto lhs = load_source(LHS);
             a.and_(TMP1, lhs.reg, imm(~_TAG_IMMED1_MASK));
             a.lsl(TMP1, TMP1, imm(RHS.as<ArgSmall>().getSigned()));
         } else {
4 changes: 2 additions & 2 deletions erts/emulator/beam/jit/arm/instr_bif.cpp
@@ -109,7 +109,7 @@ void BeamModuleAssembler::emit_i_bif1(const ArgSource &Src1,
                                       const ArgLabel &Fail,
                                       const ArgWord &Bif,
                                       const ArgRegister &Dst) {
-    auto src1 = load_source(Src1, TMP1);
+    auto src1 = load_source(Src1);

     a.str(src1.reg, getXRef(0));

@@ -169,7 +169,7 @@ void BeamModuleAssembler::emit_i_bif(const ArgLabel &Fail,
 void BeamModuleAssembler::emit_nofail_bif1(const ArgSource &Src1,
                                            const ArgWord &Bif,
                                            const ArgRegister &Dst) {
-    auto src1 = load_source(Src1, TMP1);
+    auto src1 = load_source(Src1);

     a.str(src1.reg, getXRef(0));

2 changes: 1 addition & 1 deletion erts/emulator/beam/jit/arm/instr_bs.cpp
@@ -632,7 +632,7 @@ void BeamModuleAssembler::emit_i_bs_match_string(const ArgRegister &Ctx,
 void BeamModuleAssembler::emit_i_bs_get_position(const ArgRegister &Ctx,
                                                  const ArgRegister &Dst) {
     const int start_offset = offsetof(ErlSubBits, start);
-    auto ctx_reg = load_source(Ctx, TMP1);
+    auto ctx_reg = load_source(Ctx);
     auto dst_reg = init_destination(Dst, TMP2);

     /* Match contexts can never be literals, so we can skip clearing literal