Skip to content
This repository has been archived by the owner on Feb 4, 2020. It is now read-only.

Incremental builds with MSBuild/Visual Studio are broken #227

Open
Un1oR opened this issue Sep 6, 2016 · 29 comments
Open

Incremental builds with MSBuild/Visual Studio are broken #227

Un1oR opened this issue Sep 6, 2016 · 29 comments
Labels

Comments

@Un1oR
Copy link

Un1oR commented Sep 6, 2016

I think it problem related with #33, but a bit more complicated.
When I try incremental build, all solution fully rebuilds. I found it not happens if I comment body of PersistentJSONDict.save function. As I understand, MSBuild tracks writing a statistics file and mark projects in solution as outdated.
Affects only solutions with two or more projects with build dependencies between it!

@frerich
Copy link
Owner

frerich commented Sep 7, 2016

Can you provide a sample project which triggers this? On first sight, it sounds like another incarnation of https://github.com/frerich/clcache/wiki/Caveats#no-cache-hits-when-building-via-visual-studio-ide-or-msbuild -- i.e. the file tracking feature needs to be disabled. Does that help with incremental builds, too? If so, I guess that Wiki page should be extended.

@Un1oR
Copy link
Author

Un1oR commented Sep 7, 2016

Disabling tracking is incompatible with incremental builds, as far I'm understand.
I'll add my test project a little bit later.

@Un1oR
Copy link
Author

Un1oR commented Sep 7, 2016

https://github.com/Un1oR/clcache-incremental-test
Interesting thing I found. It affects only hot cache case. If I run build.bat first three times I haven't rebuilds at result. But if I remove build directories and run it againt, I will always have full rebuild.
My log https://gist.github.com/Un1oR/81062c4a6e04ca3ec072ff986e0718e8.

@Un1oR
Copy link
Author

Un1oR commented Sep 8, 2016

In addition to my previous comment. If I comment registerCacheHit function body a problem goes away.

@frerich
Copy link
Owner

frerich commented Sep 9, 2016

I suspect what happens is that due to the file tracing feature of Visual Studio, it notices that for any .cpp file being compiled, the stats.txt file is touched as well. So it considers that a 'dependency' of some sort: if a.cpp is touched, stats.txt needs to get touched -- and maybe Visual Studio also has a backling such that it thinks that if stats.txt is touched, a.cpp needs to be updated, too.

Thus, when doing an incremental build which merely compiles single .cpp files, it notices that doing so touches stats.txt (to count a cache hit) and thus it thinks that all other .cpp files will need to be recompiled as well? Sounnds unlikely...

I skimmed the documentation about Incremental Builds, but didn't spot anything which might help with avoiding this issue. My feeling is that if we just had a way to either disable tracking writes to stats.txt, this issue would be gone.

@Un1oR
Copy link
Author

Un1oR commented Sep 14, 2016

I have simplified example. Now it contains one project with two cpp files and problem still is reproduced.

I have tried to find workaround for disabling statistics file tracking. ExcludedInputPaths and TrackedOutputFilesToIgnore options wasn't have effect for me. Put cache directory into one of this folders is works, but it isn't useful solution, I think.

@frerich
Copy link
Owner

frerich commented Sep 14, 2016

TrackedOutputFilesToIgnore sounds promising! I didn't see it mentioned too often though. It is listed in the Microsoft.Build.CommonTypes.xsd schema, but I see no mentioning of the term TrackedOutputFilesToIgnore in the C# source code of MSBuild. :-/

@frerich
Copy link
Owner

frerich commented Sep 19, 2016

I think we should really start looking into integration tests which use clcache the same way Visual Studio does, maybe using msbuild instead of calling cl.exe triggers this already?

@ivaigult
Copy link

I'm observing the same behavior and even worse: compiled binaries don't contain my changes, but full rebuild helps.

@frerich frerich added the bug label Nov 25, 2016
@frerich
Copy link
Owner

frerich commented Nov 25, 2016

Alas I don't use msbuild myself -- is it possible to detect incremental builds by looking at the cl command line? If so, we could at least treat such invocations as unsupported and don't try to restore any cached objects.

@ivaigult
Copy link

The command line is exactly same as for regular build, but I cannot reproduce this without Visual Studio or msbuild.

@izmmisha
Copy link

Would be nice if this problem solved by #319 PR.

@Un1oR
Copy link
Author

Un1oR commented Jul 26, 2018

I'll try to check it out soon!

@izmmisha
Copy link

izmmisha commented Aug 7, 2018

@RomanYudintsev yeah, I saw such errors before.

cl : Command line warning D9002: ignoring unknown option '--multiprocessing-fork' [D:\test\clcache-incremental-test\app
\app.vcxproj]
cl : Command line warning D9024: unrecognized source file type 'parent_pid=9860', object file assumed [D:\test\clcache-
incremental-test\app\app.vcxproj]
cl : Command line warning D9024: unrecognized source file type 'pipe_handle=492', object file assumed [D:\test\clcache-
incremental-test\app\app.vcxproj]

This problem related to freezing of app. There is multiprocessing.freeze_support() to solve this problem, but looks like it not works in your case.

I'm using PyInstaller to create exe and it works. How you generate exe? Could you please show output of this process, probably it can show any error with multiprocessing module.

@izmmisha
Copy link

izmmisha commented Aug 7, 2018

@RomanYudintsev also incremental build will not work with clcache_branch.exe executable, tracker will create clcache_branch*.tlog files, but msbuild will read only cl.*.tlog files, so you need to rename it to cl.exe or cl.cache_branch.exe so cl.*.exe

@RomanYudintsev
Copy link

build log for my clcache_branch.exe - https://gist.github.com/RomanYudintsev/4cd8657a0d682839bbceef128e9624b0
I'll try rename file

@izmmisha
Copy link

izmmisha commented Aug 7, 2018

Ok, I see that I'm building exe in different way (I'm new to this project), should be fixed at izmmisha@137a3e2

Thank you @RomanYudintsev

@RomanYudintsev
Copy link

It's works! and not full rebuild :)
Thanks @izmmisha

@izmmisha
Copy link

izmmisha commented Aug 9, 2018

After some time of inspecting msbuild I found another things.
v120 toolchain (VS2013) use hardcoded tlog names

    protected override string[] ReadTLogNames
    {
      get
      {
        return new string[2]
        {
          "cl.read.1.tlog",
          "cl.*.read.1.tlog"
        };
      }
    }

v140 toolchain use CLToolExe for filenames

    protected override string[] ReadTLogNames
    {
      get
      {
        string withoutExtension = Path.GetFileNameWithoutExtension(this.get_ToolExe());
        return new string[3]
        {
          withoutExtension + ".read.*.tlog",
          withoutExtension + ".*.read.*.tlog",
          withoutExtension + "-*.read.*.tlog"
        };
      }
    }

So @RomanYudintsev I see that your linker version is 14, probably you are using v140 toolchain.
With v140 toolchain we don't need to rename clcache to cl.*, but it inspecting logs from all threads (not only main thread like v120). This mean that proposed solution (at #319 PR) with moving read stats.txt from main thread will not work.

But as mentioned earlier msbuild have some properties to set ignore files:

TrackedOutputFilesToIgnore sounds promising! I didn't see it mentioned too often though. It is listed in the Microsoft.Build.CommonTypes.xsd schema, but I see no mentioning of the term TrackedOutputFilesToIgnore in the C# source code of MSBuild. :-/

In case of stats.txt and manifests it is possible to setup TrackedInputFilesToIgnore using NoDependencies property, it can be set via msbuild command line option /p:NoDependencies=c:\path\to\clcache\* or via environment variable set NoDependencies=c:\path\to\clcache\*.

So probably with v140 toolchain you can use NoDependencies without clcache modification at all.
v120 toolchain require to read of source files from main thread.

@izmmisha
Copy link

izmmisha commented Aug 14, 2018

Ok, finally I found correct way how to make it work.
FileTracker.dll exports API to suspend/resume tracking operations.
PR #319 updated. It should works with v120, v140 toolchains.
But it still necessary to froze clcache to cl.exe when v120 toolchain used. I don't think that this is a problem, just use ClToolPath=c:\path\to\clcache\ instead ClToolExe=clcache.exe
v140 toolchain works with clcache.exe

I think we should really start looking into integration tests which use clcache the same way Visual Studio does

@frerich, Added some tests at 408d609 653ac12 (fixed v140 tests)

@pthom
Copy link

pthom commented Nov 19, 2018

@izmmisha : I tested your fork on a large project (600 000 LoC, 173 projects).

It works almost perfectly! Thanks a lot! The compilation time dropped from 30 minutes to 3 minutes, and incremental builds now work flawlessly under MSVC 2015 and 2017 (and it integrates well with Qt moc).

I wrote a script to simplify the integration with Visual Studio : you can find it here:

However I have one small issue : sometimes one invocation of cl.exewill fail with the message

29>  PermissionError: [WinError 5] Access is denied: 'C:\\Users\\pascal\\clcache\\stats.txt.new' -> 'C:\\Users\\pascal\\clcache\\stats.txt'

This is extremey rare : the probability is about 0.1%, so it happens only with large projects with a large number of hits. Using clcache-server does not solve it.

However when I encounter, I just need to do a second build, and it always work (and build only the missing files).

Do you have an idea on how to solve this problem?

Here is the full trace when I have this issue:

29>  C:\Python36-32\Scripts\clcache.py Reusing cached object for key e6d9192be7989a90ce13033623593150 for object file ExternalCppUtils_GTest.dir\Release\CircularAccess_GTest.obj
29>  C:\Python36-32\Scripts\clcache.py Finished. Exit code 0
29>  concurrent.futures.process._RemoteTraceback:
29>  """
29>  Traceback (most recent call last):
29>    File "c:\python36-32\lib\concurrent\futures\process.py", line 175, in _process_worker
29>      r = call_item.fn(*call_item.args, **call_item.kwargs)
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 1725, in processSingleSource
29>      return processDirect(cache, objectFile, compiler, cmdLine, sourceFile)
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 1755, in processDirect
29>      return processCacheHit(cache, objectFile, cachekey)
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 1537, in processCacheHit
29>      stats.registerCacheHit()
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 153, in untrackedFunc
29>      return func(*args, **kwargs)
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 795, in __exit__
29>      self._stats.save()
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 703, in save
29>      json.dump(self._dict, f, sort_keys=True, indent=4)
29>    File "c:\python36-32\lib\contextlib.py", line 88, in __exit__
29>      next(self.gen)
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 162, in atomicWrite
29>      os.replace(tempFileName, fileName)
29>  PermissionError: [WinError 5] Access is denied: 'C:\\Users\\pascal\\clcache\\stats.txt.new' -> 'C:\\Users\\pascal\\clcache\\stats.txt'
29>  """
29>
29>  The above exception was the direct cause of the following exception:
29>
29>  Traceback (most recent call last):
29>    File "C:\Python36-32\Scripts\clcache-script.py", line 11, in <module>
29>      load_entry_point('clcache==4.1.1.dev65+gc4a6f5e', 'console_scripts', 'clcache')()
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 1619, in main
29>      return processCompileRequest(cache, compiler, sys.argv)
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 1646, in processCompileRequest
29>      return scheduleJobs(cache, compiler, cmdLine, environment, sourceFiles, objectFiles)
29>    File "c:\python36-32\lib\site-packages\clcache\__main__.py", line 1704, in scheduleJobs
29>      exitCode, out, err, doCleanup = future.result()
29>    File "c:\python36-32\lib\concurrent\futures\_base.py", line 425, in result
29>      return self.__get_result()
29>    File "c:\python36-32\lib\concurrent\futures\_base.py", line 384, in __get_result
29>      raise self._exception
29>  PermissionError: [WinError None] Access is denied: 'C:\\Users\\pascal\\clcache\\stats.txt.new' -> 'C:\\Users\\pascal\\clcache\\stats.txt'

@izmmisha
Copy link

It works almost perfectly! Thanks a lot!

Glad to hear, thanks.

Do you have an idea on how to solve this problem?

I had such problem on my side. It related to windows explorer, you are keep open window with C:\Users\pascal\clcache\ folder. I don't know how to solve it, except close explorer... (probably any other software which can watch folder content can cause such problem too)

@pthom
Copy link

pthom commented Nov 19, 2018

Thanks for your answer! Actually I had no explorer window opened on this folder.

So, I posted a workaround for the stats.txt issue : it is here
pthom@78a4630

On my side, it works perfectly.

Do you want to add this patch to your PR? Or do you prefer that I post a separate PR?

Thanks!

@izmmisha
Copy link

Do you want to add this patch to your PR? Or do you prefer that I post a separate PR?

Well, would be better to identify the root of problem, so probably it introduced when used ProcessPoolExecutor, but I have only MSVS 2013 toolchain and it is necessary to make it work.

But if you are able to reproduce this problem on your side and it looks like you are using MSVS 2017 (or 2015), you can try revert this commit 1aebba3 then we will know is ProcessPoolExecutor related to this problem or not.

@pthom
Copy link

pthom commented Nov 19, 2018

Hi, I tried to revert your commit (and my modif), and the problem is still present.
Your commit is probably out of cause and I think this problem is older than your PR.

I suspect the issue revolves around the locking mechanism inside the Statistics class.
It instantiates a lock. However as far as I understand, it is never used.

So I tried to use the lock inside the __exit__ method as below :

    @untrackable
    def __exit__(self, typ, value, traceback):
        with self.lock:
            # Does not write to disc when unchanged
            self._stats.save()

But it did not solve the problem either.

I think the issue is difficult to solve by code :

See the class CacheLock : it tries to group a standard mutex together with a global windows mutex. This is quite tricky, and might fail sometimes.

class CacheLock:
    def __init__(self, mutexName, timeoutMs):
        self._mutexName = 'Local\\' + mutexName
        self._mutex = None
        self._timeoutMs = timeoutMs
    def acquire(self):
        if not self._mutex:
            self.createMutex()
        result = windll.kernel32.WaitForSingleObject(
            self._mutex, wintypes.INT(self._timeoutMs))
        if result not in [0, self.WAIT_ABANDONED_CODE]:
            if result == self.WAIT_TIMEOUT_CODE:

@pthom
Copy link

pthom commented Nov 21, 2018

On a side note, I have discovered that this PR only works with python 32 bits.

This is due to the fact that MSVC and msbuild is itself a 32 bits program, and this PR needs to communicate with msbuild via a 32 bits dll (FileTracker.dll)
see commit: 2e13e11

@izmmisha
Copy link

@pthom Well, I did some tests using this simple script:

import concurrent.futures
import tempfile
import os

#os.environ['TRACKER_ENABLED']="1"

from clcache.__main__ import (
    Statistics,
)

#STATS_FILE = 'C:\\Users\\im\\clcache\\stats.txt' #problem
#STATS_FILE = tempfile.NamedTemporaryFile(dir='C:\\Users\\im\\clcache', suffix='.txt').name #problem
#STATS_FILE = tempfile.NamedTemporaryFile(suffix='.txt').name #no problem
#STATS_FILE = tempfile.NamedTemporaryFile(dir='C:\\Users\\im\\clcache').name #problem without txt extension too
STATS_FILE = 'C:\\clcache\\stats.txt' #no problem

def change_stats():
    stats = Statistics(STATS_FILE)
    with stats.lock, stats as s:
        s.registerCallWithInvalidArgument()


if __name__ == '__main__':
    print(STATS_FILE)
    while True:
        with concurrent.futures.ThreadPoolExecutor(4) as executor:
            futures = [executor.submit(change_stats) for _ in range(1000)]
            for future in concurrent.futures.as_completed(futures):
                future.result()

So, I'm able to reproduce this problem with ThreadPoolExecutor and without untracker's decorators:

C:\Users\im\clcache\tmpp4e3tv5w
Traceback (most recent call last):
  File "stattest.py", line 28, in <module>
    future.result()
  File "c:\Python34\lib\concurrent\futures\_base.py", line 395, in result
    return self.__get_result()
  File "c:\Python34\lib\concurrent\futures\_base.py", line 354, in __get_result
    raise self._exception
  File "c:\Python34\lib\concurrent\futures\thread.py", line 54, in run
    result = self.fn(*self.args, **self.kwargs)
  File "stattest.py", line 19, in change_stats
    s.registerCallWithInvalidArgument()
  File "c:\Python34\lib\site-packages\clcache\__main__.py", line 809, in __exit__
    self._stats.save()
  File "c:\Python34\lib\site-packages\clcache\__main__.py", line 719, in save
    json.dump(self._dict, f, sort_keys=True, indent=4)
  File "c:\Python34\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "c:\Python34\lib\site-packages\clcache\__main__.py", line 162, in atomicWrite
    os.replace(tempFileName, fileName)
PermissionError: [WinError 5] Отказано в доступе: 'C:\\Users\\im\\clcache\\tmpp4e3tv5w.new' -> 'C:\\Users\\im\\clcache\\tmpp4e3tv5w'

The thing I get from this test is that I'm able to easily reproduce such problem only when file located at %USERPROFILE%\clcache, if it located at %USERPROFILE%\AppData\Local\Temp there is no error, if it located at C:\clcache\ also no problem (and as you mentioned before windows explorer is not related to this).

So probably this problem should be added to caveat section of wiki.

@9a4gl
Copy link

9a4gl commented Jan 8, 2019

On a side note, I have discovered that this PR only works with python 32 bits.

This is due to the fact that MSVC and msbuild is itself a 32 bits program, and this PR needs to communicate with msbuild via a 32 bits dll (FileTracker.dll)
see commit: 2e13e11

I am using x86 toolchains and x64 python, and this commit helped in my case:
9a4gl@dd15bd8

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

7 participants