Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add API set information to functions where applicable #1929

Closed
ChrisDenton opened this issue Jun 19, 2024 · 13 comments
Closed

Add API set information to functions where applicable #1929

ChrisDenton opened this issue Jun 19, 2024 · 13 comments

Comments

@ChrisDenton
Copy link
Contributor

I think it might be useful for the metadata to contain information on API sets on relevant functions. E.g.:
[ApiSet(api-ms-win-shell-shellfolders-l1-1-0)] or a similar attribute.

Some people have requested that Rust's standard library use API sets rather than the "old" DLLs for a variety of reasons:

  1. API sets are always loaded from a system DLL, even if they're not KnownDLLs. This can matter for some types of applications (e.g. standalone applications that may be run from the user's Downloads folder) that want secure loading of DLLs. There are other ways to mitigate that but using API sets has the advantage of having inherently more secure DLL loading.
  2. It allows loading fewer DLLs. Some API sets redirect to DLLs that are more tightly scoped but aren't necessarily stable between Windows versions/skus.
  3. The delay load linker feature works on DLLs so being more fine grained can be useful. E.g. using delay load on kernel32 is often not that useful.

I am currently exploring the options here and not committing to anything yet but I think either way it would be nice if metadata had some more data, especially given that there is a lack of up to date documentation on API sets.

@riverar
Copy link
Collaborator

riverar commented Jun 19, 2024

Not sure where we would get authoritative apiset information from. The SDK, docs, and libs don't even have the right information. 🤔

Hmm...

@ChrisDenton
Copy link
Contributor Author

Well #1928 suggests that's not an issue unique to apisets 😛

But I'd suggest getting it from import libs would be the best option. If they're wrong then that's an issue that needs fixing on their end.

@kennykerr
Copy link
Contributor

  • We have some internal requests for this as well, so that we can build Rust binaries for stripped down versions of Windows that don't necessarily have traditional DLLs like kernel32.
  • We have some internal apiset.xml files that provide the mapping. They're meant to be used to generate umbrella libs like onecoreuap.lib but perhaps those libs don't quite get the right information.
  • I'd be happy to just switch to this rather than keeping two sets of library names for each function.

cc @dpaoliello

@riverar riverar changed the title Add API set information to funcations where applicable Add API set information to functions where applicable Jun 19, 2024
@kennykerr
Copy link
Contributor

I've avoided apisets for a long time because I couldn't see how they benefited developers directly. Recently I received a few different requests to look into them and so I finally had a good conversation with the engineers in Windows responsible for apisets, reverse forwarders, umbrella libs, and all the rest. After much back and forth it was finally concluded that indeed there is no direct value to the developer. It's just not the right abstraction or way to express library dependencies in Windows. They serve an important role but not one that we should interact with directly.

There are some APIs that only live in an apiset because that's the only library name we've been given and in such cases that's just what we use and that's fine. But there is no benefit in trying to get to the apiset behind a "legacy" library name like user32.dll, because well, the apiset in many cases just redirects back to the "legacy" DLL anyway and if it were ever moved, as is sometimes the case with APIs in kernel32.dll for example, they'll have forwarders in place to fix it up. There are even old APIs like SetCursorPos that show up in six different apisets and nobody can tell me which to use. There are also rumblings of the apiset names changing in future making it even more problematic.

The best advice we can give is to stick with the traditional library names like user32.dll, kernel32.dll, and so on.

@ChrisDenton
Copy link
Contributor Author

Thanks Kenny! I'm more than happy to go with the expert opinion on this.

The only two concrete DLLs that were mentioned to me where apisets do make a difference are kernelbase.dll (instead of kerenl32.dll) and windows.storage.dll (instead of shell32.dll).

I'll close this issue now as it doesn't seem useful to explore this any further but I don't might reopening should something change in the future.

@ChrisDenton ChrisDenton closed this as not planned Won't fix, can't repro, duplicate, stale Jun 22, 2024
@kennykerr
Copy link
Contributor

And there will undoubtedly be some cases where we need to use a more appropriate library name and that's certainly fine to consider on a case-by-case basis.

@kennykerr
Copy link
Contributor

The main bit of work here is to be more consistent. Right now I gather we're scraping the umbrella libs like onecoreuap but those libs have a mishmash of traditional and apiset library names. We should at least be consistent and stick with the traditional library names where applicable.

GetStagedPackageOrigin for example is reported to come from api-ms-win-appmodel-runtime-l1-1-1.dll but it should in fact be kernelbase.dll as per the docs:

https://learn.microsoft.com/en-us/windows/win32/api/appmodel/nf-appmodel-getstagedpackageorigin

In fact, the internal database (WCD) where all of this is tracked states that Kernelbase.dll is the "preferred module".

@ChrisDenton
Copy link
Contributor Author

Scanning the import libs in the Windows SDK (10.0.26100.0) it seems that GetStagedPackageOrigin is only ever used with the apiset dll name. So this sounds like it needs a coordinated fix so that both metadata and import libs agree?

@kennykerr
Copy link
Contributor

Ideally the umbrella libs would just use the conventional library names as well.

@talagrand
Copy link

I'll echo Chris's original value proposition. The conventional names (kernel32.dll, ole32.dll, etc) bring in additional functionality on top of the core carve-outs (kernelbase.dll, combase.dll) that it may be desirable to scope out of a process.
In my case, over the long weekend, I was experimenting with getting a binary built with windows-rs running under an AppContainer with the win32k process mitigation enabled. You can play along with LaunchAppContainer.exe here: https://github.com/microsoft/SandboxSecurityTools

This is not currently possible, because my process links to ole32.dll (CoInitializeEx), which links to user32.dll, which requires win32k functionality. But all functionality that I need is actually hosted in combase.dll, addressable via APISets, which does not have this dependency - so the conventional name is standing in the way of dropping the win32k dependency. combase.dll isn't documented, so it would seem that APISets would be the preferred binding there. Directly targeting kernelbase.dll/combase.dll could be viable, but without firm precedent, though as others have commented, the import bindings generally are not well-documented.
I'm happy to open a new issue but it seems this is the most direct solution to this problem.

@talagrand
Copy link

If anyone else stumbles across this, here's a really gnarly workaround to force apiset linkage: https://github.com/talagrand/rust-apisets

@riverar
Copy link
Collaborator

riverar commented Dec 9, 2024

Throwing out several other tips/ideas:

If you know that all your user32.dll imports should go to api-set-foo-1-2-3.dll, you could also add /DELAYLOAD:user32.dll and provide your own _delay_load_helper that handles the redirect.

Alternatively, if you can stabilize the linker lib order, you can provide your own .lib later that overrides anything imported from the windows crate.

@ChrisDenton
Copy link
Contributor Author

A less gnarly workaround for the MSVC toolchain is to use the LINK environment variable. A lib set there will be seen by the linker before any other command line argument so it should be evaluated before anything set later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants