-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory Datatype and Pointer type #550
Conversation
Add .getPrimitiveFields(Type,int) Add .getByteOffset(Type,List) Add memory size to BooleanType
public Formula visit(NullPointer constant) { | ||
return context.useIntegers ? integerFormulaManager().makeNumber(0) : | ||
bitvectorFormulaManager().makeBitvector(64, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use size of the pointer type rather than the hardcoded value 64.
} | ||
// only encode offset expressions if not constant. | ||
// byteCounts already contains constant offset values. | ||
final Formula count = constant != null ? null : offset.accept(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this distinction is important. I think the method is a lot more complicated than it needs to be.
Also, seeing all this distinction between integers and bitvectors, I think it might be a good idea to have a NumberEncoder
that encapsulates IntegerFormulaManager
and BitvectorFormulaManager
with a unifying interface to avoid all this code duplication.
int result = value.intValue(); | ||
checkState(BigInteger.valueOf(result).equals(value), "Integer bit length exceeded by %s", value); | ||
return result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use value.intValueExact
which automatically throws when the conversion loses information.
// If we reduce MemoryObject as a normal IConst, we loose the fact that it is a Memory Object | ||
// We cannot call reduce for RSHIFT (lack of implementation) | ||
if(!(lhs instanceof MemoryObject) && op != RSHIFT) { | ||
if(op != RSHIFT) { | ||
return expressions.makeBinary(lhs, op, rhs).reduce(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the RSHIFT restriction was because back then we didn't know the bitwidth of the expression, so we didn't know if a sign-bit gets shifted or not. Nowadays, I think this should work fine.
Also, the comment about MemoryObject can be removed.
@@ -160,7 +158,7 @@ public Expression visit(IExprBin iBin) { | |||
// Rule for associativity (rhs is IConst) since we cannot reduce MemoryObjects | |||
// Either op can be +/-, but this does not affect correctness | |||
// e.g. (&mem + x) - y -> &mem + reduced(x - y) | |||
if(lhs instanceof IExprBin lhsBin && lhsBin.getRHS() instanceof IConst && lhsBin.getOp() != RSHIFT) { | |||
if(lhs instanceof IExprBin lhsBin && lhsBin.getRHS() instanceof IValue && lhsBin.getOp() != RSHIFT) { | |||
Expression newLHS = lhsBin.getLHS(); | |||
Expression newRHS = expressions.makeBinary(lhsBin.getRHS(), lhsBin.getOp(), rhs).reduce(); | |||
return expressions.makeBinary(newLHS, op, newRHS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This optimization is wrong, no? The LHS op may be mul, div, etc. all of which would give wrong results.
Even if the LHS is sub, this is wrong: (a - x) - y != a - (x - y)
|
||
import static com.google.common.base.Preconditions.checkArgument; | ||
|
||
public final class PointerCast implements Expression { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is inconsistent that we use a unary integer expression with cast op for signed/unsigned casts but then use a dedicated PointerCast
class for casts from and to pointers.
It would be better to have a single CastExpression
class for all casts. It shouldn't be done in this PR, but we have to keep it in mind.
mem.setCVar(name); | ||
return mem; | ||
} | ||
|
||
public MemoryObject newMemoryObject(String name, int size) { | ||
checkState(!locations.containsKey(name), | ||
"Illegal allocation. Memory object %s is already defined", name); | ||
final MemoryObject mem = program.getMemory().allocate(size, true); | ||
final MemoryObject mem = program.getMemory().allocate(types.getArchType(), size, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can we use archType
here but use byteType
for the virtual allocations?
Also, what is the meaning of archType
now? Is it an integer type whose size matches the size of pointer types?
Expression result = v1.getType() instanceof IntegerType ? | ||
expressions.makeBinary(v1, ctx.opArith().op, v2) : | ||
expressions.makeGetElementPointer(archType, v1, List.of(v2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks dangerous. There needs to be at least a check that the op is +
when doing GEP.
I'm also not sure if we can silently introduce GEPs/PointerType into Litmus code like this. Maybe it is fine for CLitmus, but in general we need to be careful.
A canonical way to handle litmus code while requiring pointer types for e.g. memory accesses, is to put a ptr2int cast when usingMemoryObject
and int2ptr
when doing a memory access. In that way, the litmus code will think everything is an integer.
? (expr -> expr instanceof IValue || expr instanceof MemoryObject || expr instanceof Function || | ||
expr instanceof BConst || expr instanceof Register) | ||
: (expr -> expr instanceof IValue || expr instanceof MemoryObject || expr instanceof Function || | ||
expr instanceof BConst); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can refactor the common check into a isConstant
method.
With how big this has gotten again, and the number of issues, I will close this PR. |
Adds datatypes to memory objects, as well as the opaque pointer type to the IR. Removes the preprocessing step that replaces GEP expressions with simple additions.
To infer the datatype of dynamically-allocated objects, the program is scanned for occurrences of that address. If those occurrences do not agree on one type, it defaults to byte. The current implementation is unsound, as it ignores registers and the base address being stored in memory, where another reader may assume another type. The datatype currently controls initializations.
The pointer type will allow the encoder to choose a different theory for address expressions. The IR will be closer to the LLVM IR and should become more readable.
More:
IConst
anymore, the interface is removed entirely.