-
Hi, I want to get the offset of a member in a C struct. This is the code I'm using:

    from elftools.dwarf.die import DIE

    def member_offset(die: DIE) -> int | None:
        assert die.tag == "DW_TAG_member"
        data_member_location = die.attributes.get("DW_AT_data_member_location")
        if data_member_location is None:
            # bitfield may not have a data_member_location
            return None
        if type(data_member_location.value) == int:
            # elf produced by linux-x64-gcc falls in this
            return data_member_location.value
        print(f"Unknown data_member_location value={data_member_location.value} die={die}")
        return None

For this armcl-produced ELF (DWARF3), test.out.zip, the data_member_location.value is a ListContainer rather than an int.
I want to ask why the value is a ListContainer and what its first element, 35, means. Thanks very much.

My test code, starting with the C source for the ELF:

    // c code for the elf
    struct S {
        char char_member;
        int int_member;
    } S_var;
    int main() { return S_var.int_member; }

The Python script:

    from elftools.elf.elffile import ELFFile
    from elftools.dwarf.die import DIE
    import sys

    def member_offset(die: DIE) -> int | None:
        assert die.tag == "DW_TAG_member"
        data_member_location = die.attributes.get("DW_AT_data_member_location")
        if data_member_location is None:
            # bitfield may not have a data_member_location
            return None
        if type(data_member_location.value) == int:
            # elf produced by linux-x64-gcc falls in this
            return data_member_location.value
        print(f"Unknown data_member_location value={data_member_location.value} die={die}")
        return None

    def main(filename: str, var_name: str):
        # Everything stays inside the with block: DIEs are read lazily from the file.
        with open(filename, "rb") as f:
            e = ELFFile(f)
            if not e.has_dwarf_info():
                print("No DWARF info")
                sys.exit(1)
            d = e.get_dwarf_info()
            var = None  # stays None if the file has no CUs at all
            for cu in d.iter_CUs():
                die = cu.get_top_DIE()
                var = find_variable(die, var_name)
                if var is not None:
                    break
            if var is None:
                print(f"Variable {var_name} not found")
                sys.exit(1)
            var_type = get_DW_AT_type(var)
            process_top_type(var_type)

    def find_variable(die: DIE, var_name: str):
        if die.tag == "DW_TAG_variable":
            name = die.attributes.get("DW_AT_name")
            if name is not None and name.value == bytes(var_name, "ascii"):
                return die
        for c in die.iter_children():
            # Return the matching DIE itself, which may sit deeper than the child c.
            found = find_variable(c, var_name)
            if found is not None:
                return found
        return None

    def get_DW_AT_type(die: DIE) -> DIE:
        return die.get_DIE_from_attribute("DW_AT_type")

    def get_DW_AT_name(die: DIE) -> str:
        return str(die.attributes["DW_AT_name"].value, "ascii")

    def resolve_typedef(die: DIE) -> DIE:
        while die.tag == "DW_TAG_typedef":
            die = get_DW_AT_type(die)
        return die

    def process_top_type(original_type: DIE):
        resolved_type = resolve_typedef(original_type)
        if resolved_type.tag != "DW_TAG_structure_type":
            print("top type is not a struct")
            sys.exit(1)
        for member in resolved_type.iter_children():
            name = get_DW_AT_name(member)
            offset = member_offset(member)
            print(f"offsetof({name})={offset}")

    main("test.out", "S_var")
-
I've figured out a bit about the ListContainer's meaning. It's a DWARF expression: the first element, 35, is DW_OP_plus_uconst, and the rest of the list is a ULEB128-encoded byte stream. I've worked out a workaround to extract the value. How should I do this cleanly?

    from elftools.dwarf.die import DIE
    from elftools.construct.lib import ListContainer
    from elftools.common.construct_utils import ULEB128
    from io import BytesIO

    def member_offset(die: DIE) -> int | None:
        assert die.tag == "DW_TAG_member"
        data_member_location = die.attributes.get("DW_AT_data_member_location")
        if data_member_location is None:
            # bitfield may not have a data_member_location
            return None
        value = data_member_location.value
        if type(value) == int:
            # elf produced by linux-x64-gcc falls in this
            return value
        if type(value) == ListContainer:
            DW_OP_plus_uconst = 35
            if len(value) < 2 or value[0] != DW_OP_plus_uconst:
                print(f"Not implemented, value={value}")
                return None
            b = BytesIO(bytes(value[1:]))
            return ULEB128("")._parse(b, None)
        print(f"Unknown data_member_location value={value} die={die}")
        return None
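For reference, the ULEB128 decoding is small enough to write by hand, which avoids calling the private `_parse` method. This is only a minimal sketch: it assumes the expression is exactly one DW_OP_plus_uconst (opcode 0x23, i.e. 35) followed by its ULEB128 operand, and the names `decode_uleb128` and `plus_uconst_offset` are made up for illustration.

```python
DW_OP_plus_uconst = 0x23  # 35 in decimal


def decode_uleb128(data):
    """Decode one ULEB128 value from an iterable of byte values.

    Returns (value, number_of_bytes_consumed).
    """
    result = 0
    shift = 0
    for count, byte in enumerate(data, start=1):
        # The low 7 bits carry payload, least significant group first.
        result |= (byte & 0x7F) << shift
        # A clear high bit marks the last byte of the value.
        if byte & 0x80 == 0:
            return result, count
        shift += 7
    raise ValueError("truncated ULEB128 value")


def plus_uconst_offset(expr):
    """Return the constant of a lone DW_OP_plus_uconst expression, else None."""
    expr = list(expr)
    if len(expr) < 2 or expr[0] != DW_OP_plus_uconst:
        return None
    value, used = decode_uleb128(expr[1:])
    if used != len(expr) - 1:
        # Trailing bytes mean there are more operations; not handled here.
        return None
    return value
```

With this, an expression list such as `[35, 4]` would yield an offset of 4. Anything other than a single DW_OP_plus_uconst is deliberately rejected rather than guessed at.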
-
The first thing to do will be parsing the expression blob into operations and arguments:
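You can cache the parser object; it's stateless. One instance per CU if you are being meticulous.

Then you get a parsed expression, which is a list of namedtuples with op (numeric), op_name (string, the DWARF mnemonic), and args (the variable-length tuple with whatever arguments the operation takes).

Then the condition can go like this: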
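A sketch of such a check, assuming `parsed` is the list returned by `DWARFExprParser.parse_expr`. The helper name `offset_from_parsed` is hypothetical, and the `Op` namedtuple only stands in for pyelftools' parsed-operation tuples so that the snippet runs on its own.

```python
from collections import namedtuple

# Stand-in for the parsed-operation tuples (illustration only).
Op = namedtuple("Op", "op op_name args")


def offset_from_parsed(parsed):
    """Return the member offset if the expression is a lone DW_OP_plus_uconst."""
    if len(parsed) == 1 and parsed[0].op_name == "DW_OP_plus_uconst":
        # The single argument is the unsigned constant, i.e. the byte offset.
        return parsed[0].args[0]
    # Any other shape would need a real expression evaluator.
    return None


example = [Op(op=0x23, op_name="DW_OP_plus_uconst", args=[4])]
print(offset_from_parsed(example))
```

Returning `None` for every other shape keeps unexpected expressions visible instead of silently producing a wrong offset.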
You might want to run this logic over all structs in your project and see what other DWARF expressions the compiler might generate in this context. General-purpose DWARF expression evaluation, though, is a rather hairy task: expressions can potentially reference registers, memory contents, and even execution history. I don't have a ready-made Pythonic DWARF expression interpreter. That said, I wouldn't expect a reasonable C compiler to emit crazy expressions for something like a struct member location.

One more thing: for determining whether the value is an expression, I'd suggest looking at its DWARF form:
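One way to write that check, as a sketch. pyelftools attribute values carry a `form` string, and expression blobs arrive in one of the block forms or as `DW_FORM_exprloc`. The helper name `is_expression_form` and the stand-in `Attr` tuple are assumptions for illustration.

```python
from collections import namedtuple

# Stand-in for a pyelftools attribute value (illustration only).
Attr = namedtuple("Attr", "form value")

# Forms under which an attribute value is an expression blob.
EXPRESSION_FORMS = frozenset({
    "DW_FORM_block",
    "DW_FORM_block1",
    "DW_FORM_block2",
    "DW_FORM_block4",
    "DW_FORM_exprloc",
})


def is_expression_form(attr):
    """True if the attribute holds a DWARF expression rather than a constant."""
    return attr.form in EXPRESSION_FORMS


print(is_expression_form(Attr("DW_FORM_block1", [35, 4])))
print(is_expression_form(Attr("DW_FORM_data1", 4)))
```

Checking the form avoids depending on the Python type of `value`, which is an implementation detail of the library.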