Allowing dynamically sized mallocs #136
Comments
I thought a little bit more about this and found a problem with the handling of An alternative would be to assume well-initialized programs such that init reads should not be possible (hence we don't need to encode them at all). This is not a nice solution at all though. |
@hernanponcedeleon and I talked about possible solutions before. The easiest one we've come up with works as follows.
|
I think I came up with an idea that might solve this problem (and hopefully also #132 and #198). It will also have an impact on #137. As I see it, addresses are needed for As stated in #137, this way of computing alias information is over-restrictive. For the time being, i.e. until #137 is properly solved, I think we can use the solution below to keep our current alias analysis while not relying on knowing all existing addresses. The set of all addresses is used because
With this change, we get rid of the need to know all addresses for (2) and we just need to deal with (1). Instead of having one
we use an UF Coming back to the encoding of
|
I think that misses the problem. Unbounded memory is no problem for the SMT encoding. We have plenty of options there. |
You are right (I should have read the 2nd msg in the issue). However, I think that the solution in the 3rd msg combined with (2) should also remove the need to explicitly allocate every address. Regarding the solution from the 3rd msg ... I think instead of creating one
The problem we have is that whenever there is an array or structure access, Boogie usually generates two
I think that once we have the
This is just constant propagation, but if we do this after unrolling, a single iteration should be enough. In general this should give us precise information about all registers except those coming from a We could later have another pass that removes unused code (here the first two instructions). |
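The single-pass idea can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from Dartagnan: the tuple-based instruction format, the function name `propagate_constants`, and the choice to treat registers written by a load as unknown are assumptions made for illustration only. The sketch also assumes a loop-free program in which each register is assigned once.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical instruction forms for a loop-free (already unrolled) program:
#   ("const", dst, value)     dst := value
#   ("add",   dst, src, imm)  dst := src + imm
#   ("load",  dst, addr)      dst := memory[addr]  (value unknown statically)
Instr = Tuple


def propagate_constants(program: List[Instr]) -> Dict[str, Optional[int]]:
    """Single forward pass; None marks a register whose value is unknown."""
    regs: Dict[str, Optional[int]] = {}
    for instr in program:
        op, dst = instr[0], instr[1]
        if op == "const":
            regs[dst] = instr[2]
        elif op == "add":
            src_val = regs.get(instr[2])
            # An unknown source makes the result unknown as well.
            regs[dst] = None if src_val is None else src_val + instr[3]
        elif op == "load":
            # Assumption of this sketch: loaded values are not tracked.
            regs[dst] = None
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


program = [
    ("const", "r0", 100),   # r0 holds a known address
    ("add", "r1", "r0", 8), # r1 = r0 + 8 is known after one pass
    ("load", "r2", "r1"),   # r2 comes from memory, so it is unknown
    ("add", "r3", "r2", 4), # r3 depends on r2, so it is unknown too
]
result = propagate_constants(program)
```

Because the program has no back edges after unrolling, every register is defined before it is used, so one pass in program order reaches the fixed point.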
Creating an init event per
|
I want to give an update on this since we recently removed the requirement of init events in #624, which were the main problem in supporting dynamically sized mallocs. With those out of the way, let me sketch what needs to be done to add the mallocs. (1) The (2) The (3) The
This encoding can be optimized in two ways. (1) if we order all statically-sized memory objects to come first, we can give them constant addresses (i.e. encode Aspects that need to be considered:
|
Fixed by #750 |
Right now, Dartagnan overapproximates the set of all possible addresses but to do so, it uses two assumptions:
Why do we do this? Well, if we have a fixed set of addresses, we can compute for each memory event the set of addresses it could potentially affect. We use this information to obtain alias information. However, the encoding of memory accesses doesn't care at all. We can use arbitrary address expressions for each memory event that evaluate to any arbitrary integer without regarding the address space at all. In other words, we are not tied to a fixed memory space in the encoding but rather in the current alias analysis.
Let's assume that we drop the second assumption, that is, we still have a fixed set of mallocs but each of them is dynamically sized. For each malloc k, we can express the malloced memory region as an interval `[base_k, base_k + size_k]`. Now let us create symbolic addresses for each of the bases and let `size_k` be given by any integer expression. We can now encode
`0 < base_1, base_1 + size_1 < base_2, base_2 + size_2 < base_3 ...`
This will guarantee that no matter to what value any of the size expressions evaluate, we have non-overlapping memory.
In the absence of UB, ordering the bases in this manner is also sound.
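As a small concrete check of the ordering constraint, the following sketch picks the smallest integer bases that satisfy `0 < base_1` and `base_k + size_k < base_{k+1}` for given sizes, and then verifies that the resulting closed intervals are pairwise disjoint. The function names are hypothetical, sizes are assumed to be non-negative, and concrete integers stand in for what would be symbolic expressions in an SMT encoding.

```python
from typing import List


def assign_bases(sizes: List[int]) -> List[int]:
    """Pick the smallest bases with 0 < base_1 and base_k + size_k < base_{k+1}."""
    bases = []
    next_free = 1  # smallest value satisfying 0 < base_1
    for size in sizes:
        bases.append(next_free)
        next_free = next_free + size + 1  # strict '<' forces the next base past the end
    return bases


def satisfies_ordering(bases: List[int], sizes: List[int]) -> bool:
    """Check the chain 0 < base_1, base_1 + size_1 < base_2, ..."""
    if not bases:
        return True
    if bases[0] <= 0:
        return False
    return all(bases[k] + sizes[k] < bases[k + 1] for k in range(len(bases) - 1))


def regions_disjoint(bases: List[int], sizes: List[int]) -> bool:
    """Check pairwise that the closed intervals [base, base + size] do not overlap."""
    regions = [(b, b + s) for b, s in zip(bases, sizes)]
    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            lo_i, hi_i = regions[i]
            lo_j, hi_j = regions[j]
            if lo_i <= hi_j and lo_j <= hi_i:
                return False
    return True


sizes = [4, 0, 10]
bases = assign_bases(sizes)
```

Any assignment that passes `satisfies_ordering` also passes `regions_disjoint`, whatever the sizes evaluate to, which is the property the constraint chain is meant to provide.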
What shall the alias analysis do now? Well, we need to properly track bases. If we have two operations that access different bases, then they can never alias (assuming no undefined behaviour). If they access the same base, we need to compare the offsets to determine whether they can alias or not. If we cannot find any set of possible bases for some memory access, we can still assume the whole address space. If we could establish that all pointers in the program point to some base (i.e. to some malloced object), we could even improve our alias analysis (if we load a ptr and don't know its value, a subsequent load from that ptr-addr has to be a `base`, so we can still say that accesses with different offsets cannot alias, even if we do not know their bases). This is actually quite frequently the case, unless we have pointers to inside a struct/array.

So if I'm not mistaken, then the only reason we do not have dynamically sized memory is because of our alias analysis. And I believe our alias analysis could be adapted to reason about `base + offset` (I think it does so already to some extent).
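The base and offset reasoning above can be sketched as a small may-alias check. This is only an illustration under stated assumptions, not Dartagnan's alias analysis: the `Access` class, the `may_alias` function, and the `all_pointers_are_bases` flag are hypothetical names, every access is assumed to touch a single cell, and all accesses are assumed to stay in bounds (no undefined behaviour).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Access:
    bases: Optional[FrozenSet[str]]  # possible base objects; None means unknown
    offset: Optional[int]            # offset from the base; None means unknown


def may_alias(a: Access, b: Access, all_pointers_are_bases: bool = False) -> bool:
    """Conservative check: return False only if the accesses provably differ."""
    bases_known = a.bases is not None and b.bases is not None
    offsets_known = a.offset is not None and b.offset is not None

    # Different bases refer to disjoint objects, so they never alias.
    if bases_known and not (a.bases & b.bases):
        return False

    # Both addresses have the form base + offset with base an object base:
    # the same base with different offsets gives different cells, and
    # different bases give different objects. With unknown bases this only
    # holds under the assumption that every pointer is some object's base.
    if offsets_known and a.offset != b.offset:
        if bases_known or all_pointers_are_bases:
            return False

    # Otherwise fall back to the whole address space.
    return True


x = Access(frozenset({"A"}), 0)
y = Access(frozenset({"B"}), 0)
z = Access(frozenset({"A"}), 4)
u0 = Access(None, 0)
u4 = Access(None, 4)
```

With these values, `may_alias(x, y)` and `may_alias(x, z)` are both `False`, while `may_alias(u0, u4)` is `True` by default and becomes `False` only when `all_pointers_are_bases=True` is passed.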