Switching to custom DDraw texture rendering #11
I would start simple. Don't bother wrapping DDRAW.dll. I already made an ASI loader that gets you into process space. Just create an ASI plugin, find the vtable of the IDirectDrawSurface, and hook its Blt method (vtable index 5, counting from zero). In the hook procedure, get the 4-byte watermark of the source and replace the source image accordingly. The target surfaces should already be RGB24. Just make sure to have DDRAW.dll in your import table (dynamically loading it using LoadLibrary is not good while holding the loader lock in DllMain). This should allow you to blit images without palette limitations. Another aspect is the team colors. We will probably need every unit in 8 versions, each with a different team color. The team colors the game uses are documented here. The team-colored sprites have an annoying side effect: we have to read the team palette to determine which version we must draw.
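A minimal sketch of the watermark read mentioned above, assuming the watermark is simply the first four bytes of the source pixel data interpreted as a little-endian 32-bit value. The function name and the byte order are assumptions for illustration, not something the thread specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reads the first four bytes of a sprite's pixel data as a little-endian
// 32-bit watermark (byte order is an assumption of this sketch).
std::uint32_t readWatermark(const std::vector<std::uint8_t>& pixels) {
    assert(pixels.size() >= 4);
    return static_cast<std::uint32_t>(pixels[0]) |
           static_cast<std::uint32_t>(pixels[1]) << 8 |
           static_cast<std::uint32_t>(pixels[2]) << 16 |
           static_cast<std::uint32_t>(pixels[3]) << 24;
}
```

A hook procedure could use the returned value as a key into a table of replacement images.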
I would probably put all bitmaps in a large file and then memory-map it. If virtual address space gets scarce, I would just keep all files on the hard drive and load them whenever a frame needs them. When ~10 consecutive frames did not use an image, I would unload it.
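The "unload after ~10 unused frames" idea could be sketched roughly as follows. This is only an illustration: `SpriteCache`, `CachedSprite` and the loader callback are hypothetical names, and the actual file loading is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical cache entry: decoded pixels plus the last frame that used them.
struct CachedSprite {
    std::vector<std::uint8_t> pixels;
    std::uint64_t lastUsedFrame;
};

class SpriteCache {
public:
    explicit SpriteCache(std::uint64_t maxIdleFrames) : maxIdleFrames_(maxIdleFrames) {
        assert(maxIdleFrames > 0);
    }

    // Returns the sprite, loading it through `load` on a cache miss,
    // and marks it as used in the current frame.
    template <typename Loader>
    const std::vector<std::uint8_t>& get(std::uint32_t id, Loader load) {
        auto it = sprites_.find(id);
        if (it == sprites_.end()) {
            it = sprites_.emplace(id, CachedSprite{load(id), frame_}).first;
        }
        it->second.lastUsedFrame = frame_;
        return it->second.pixels;
    }

    // Call once per rendered frame: advances the frame counter and unloads
    // every sprite that has been idle for maxIdleFrames_ frames.
    void endFrame() {
        ++frame_;
        for (auto it = sprites_.begin(); it != sprites_.end();) {
            if (frame_ - it->second.lastUsedFrame >= maxIdleFrames_) {
                it = sprites_.erase(it);
            } else {
                ++it;
            }
        }
    }

    std::size_t size() const { return sprites_.size(); }

private:
    std::uint64_t maxIdleFrames_;
    std::uint64_t frame_ = 0;
    std::unordered_map<std::uint32_t, CachedSprite> sprites_;
};
```

With `SpriteCache cache(10);`, a sprite fetched in one frame stays resident for the next nine frames and is dropped at the end of the tenth frame in which nobody asked for it.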
Can you elaborate on this? What exactly happens when you swap the menu images, and what would be the desired outcome? |
Seems like the game doesn't really use the IDirectDrawSurface7 Blt (or BltFast/BltBatch) to draw each building to the screen. It looks like it draws everything by itself onto the locked surface each frame, instead of creating surfaces for each building/floor/unit.
This can best be seen in the main menu. It Blts a surface the size of the screen, and as soon as something changes (like the mouse hovering over a button) it locks the surface.
I checked the clipper, attached surfaces, overlay surfaces, nothing. |
The game uses many GDI functions. Can you check BitBlt? I think GDI is also used for drawing the panels. Maybe it's the same for the buildings.
The game does not create a surface for each object. The game uses about 13 layers / surfaces. Each surface is refreshed at a different rate. One surface holds the mini map, another one the menu panels, one the terrain/background, and one is for the objects, buildings etc. When the game wants to render a frame, it just puts all these layers on top of each other. |
How can I patch GDI functions, without creating a proxy DLL? |
Create a normal DLL/ASI project in Visual Studio. Follow this example. Make sure to have gdi32.dll in your import table. In DllMain, just write a short jmp at the beginning of BitBlt by replacing its first two bytes. Let it jump 5 bytes in front of the function (jmp BitBlt-5); the 2-byte short jump instruction for this is 0xEB 0xF9. Then create a 5-byte near jmp (0xE9) in the 5 bytes in front of the function's entry and let it jump to your hook procedure. Microsoft named this procedure "hotpatching" and it is well defined for WinAPI functions. Microsoft always puts a two-byte nop (mov edi, edi) at the beginning of each WinAPI function that you can overwrite. Many libraries can set this hook for you. For the hook procedure, create a function with the same calling convention and arguments as BitBlt. At the end of your hook procedure, call the original BitBlt function offset by 2 bytes (call BitBlt+2). Make sure to return its return value. It should look like this:
Many hooking libraries already do this kind of stuff for you. You can also be more intrusive by writing your 5 byte patch directly at BitBlt (thus overwriting the frame setup). If you do this, make sure to repair the frame in your hook procedure. |
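The two patches from the hotpatching description can be sketched in portable C++ by just computing the bytes that would be written. This is an illustration under assumptions: the function names are made up, the addresses in the usage note are arbitrary, and actually writing the bytes (changing page protection, flushing the instruction cache) is deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Addresses are 32-bit because the game is an x86 process.
using Address = std::uint32_t;

// Address of the 5 padding bytes in front of a hotpatchable function.
Address paddingBefore(Address functionEntry) {
    assert(functionEntry >= 5);
    return functionEntry - 5;
}

// Builds the 5-byte "jmp rel32" (opcode 0xE9) placed at patchAddr that
// transfers control to target. The displacement is relative to the end
// of the instruction, i.e. patchAddr + 5, and wraps modulo 2^32.
std::array<std::uint8_t, 5> makeNearJmp(Address patchAddr, Address target) {
    const std::uint32_t rel = target - (patchAddr + 5);
    std::array<std::uint8_t, 5> code{};
    code[0] = 0xE9;
    for (int i = 0; i < 4; ++i) {
        code[1 + i] = static_cast<std::uint8_t>((rel >> (8 * i)) & 0xFF);  // little-endian
    }
    return code;
}

// Builds the 2-byte "jmp rel8" (opcode 0xEB) that overwrites the 2-byte nop
// at the function entry. The instruction ends at entry + 2, so jumping to
// entry - 5 needs a displacement of -7, which is 0xF9.
std::array<std::uint8_t, 2> makeShortJmpBack() {
    return std::array<std::uint8_t, 2>{{0xEB, 0xF9}};
}
```

For a function at 0x77001005 and a hook at 0x10001000, `makeNearJmp(paddingBefore(0x77001005), 0x10001000)` yields `E9 FB FF FF 98`. The hook would then call `entry + 2` to skip the overwritten nop.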
Nope, it's not BitBlt, StretchBlt, PlgBlt or MaskBlt. BitBlt gets called about 10 times during the initialization of the engine, but then never again. There are a lot of SelectObject calls and a few CreateCompatibleBitmap calls (although no CreateBitmap). I'm slowly going through every function that would eventually need to call SelectObject. |
Then I am out of ideas. Maybe BB has used an engine that does the blitting manually. I never looked any deeper than the frame composing code that overlays all the UI layers. But I am pretty sure that they were using a Windows function, since I once stepped over each of those layers to visualize what parts of the frame they contain. Maybe I still have some notes somewhere. Do you still know the address of the vtable you hooked when hooking Blt? Maybe you got a different interface than the game requested, and thus your hook was never called simply because of different interfaces. |
I found it! It creates the images with CreateRectRgn! At least it calls the function every time something happens. |
Well, I use minhook to hook the function calls, and they definitely get called, some more than others |
Ah okay. Then the game uses IDirectDrawSurface7::Blt only to draw the intermediate layers. Good catch! I just tested a bit and could not find where the game calls CreateRectRgn. It looks like it is only called by GDI itself when drawing text, never for sprites or other content. The game uses DrawTextA to draw the texts in the panels. |
Yeah, it uses user32.dll and win32u.dll to draw the information! The exe imports a lot of functions from USER32, but none from WIN32U. |
I think we can get some pretty accurate function information when we take a look at the GfxEngine.dll of the editor, as that exports a lot of gfx-related functions, like BlitFrameToDib, CreateGuiSurface, GetPlayerColor or RenderObject/RenderResource. I don't think that they changed the functions that much, so we could just byte-search the game exe for those functions from the DLL. We really only need RenderObject/RenderResource, and we need to find where the ground textures are rendered. |
So have you made any progress? I realized that I already found the render function when creating the unlimited selection mod. We should start creating a wiki or something so we stop reversing things twice ^^
The game is actually coded in a really modular way. I don't see any reason why we could not add and render our own units (like spearmen). Note that the game renders onto an intermediate surface, which is then blitted onto the backbuffer by the function we already found above. I think I would not follow the "watermarking" approach anymore. Let's just mod the game to use the unpacked gfx files altogether. We could even do alpha blitting. The blitting function the game uses is at S4_Main.exe+25F980. It does palette mixing before blitting (for the team colors). It is a fastcall with 9 arguments: the first two are passed in ecx and edx, the rest on the stack. The caller cleans the stack. |
Yeah, that was the plan with this approach. Have every extracted png of a gfx packed together in a custom gfx file and just load these. This would make it much easier to upscale the images without any quality loss due to engine restrictions. This could also enable us to add real soft shadows to objects, instead of that black pattern they used, like you said. |
Should we hook each "Render..." function, or should we just hook their blit method at S4_Main.exe+25F980 and blit ourselves to the backbuffer if we recognize the texture to be drawn? |
I'm currently facing problems with hooking into the blit function at S4_Main.exe+25F980 and calling the trampoline function. Here is my typedef of the blit function I got from Ghidra and my hook + orig function call:
|
All the render functions call the Blt function multiple times (e.g. towers have doors that are painted over the building). Since we have to draw each of these too, we would have to completely reimplement the Render functions. Imho, that's too much work (but not impossible). I would rather reimplement the Blt function. I would do stack hacking in the hook procedure to extract the exact sprite that we have to draw from the stack. All other stuff (like position on screen etc.) is already in the arguments of the Blt function.
Don't trust Ghidra blindly. It is flawed when it comes to "fairly" recent compiler quirks. __fastcall (as specified for the Microsoft compiler) expects the function to do the stack cleanup (similar to __stdcall). However, the function you are trying to hook uses a "compiler invented calling convention" (Raymond Chen has an article about that). That means the compiler created a new calling convention for optimization purposes. It is almost a fastcall, but this one expects the caller to do the stack cleanup. If you implement it like you did above, it will crash, since now the caller AND your hook proc will clean the stack. I am not sure if many libraries can handle this for you. You will likely have to use inline assembler for this, as I am not aware of any possibility to declare new custom calling conventions. |
Okay, just turns out, that minhook is absolute trash and it completely destroys the instructions of the function with that inserted jmp. Gonna switch to polyhook2 and hope it gets better.
Wouldn't that mean I would need to make the new method __declspec(naked) and just return with something in EDX and EAX? I don't even know what the original blit really returns to the render functions to be honest ^^ |
Well, that is kind of expected. As long as it repairs the replaced instructions (by adding them to the hook proc's prolog) it's fine. After all, this function is not compiled with /hotpatch and thus has no 2-byte nop at the beginning.
I would write a naked function that converts it to stdcall. Here is an example, and here a more complex one. This also allows you to push the return address as an additional argument (useful for stack hacking later). And it allows you to repair the instructions you replaced.
It's just a VOID function (the return value is ignored by the game). |
Sadly that isn't the case. An innocent looking "mov eax, dword ptr[]", gets obliterated to an "or" opcode |
Well, I think I am at my limit regarding C++ and hooking into functions. It either crashes because the stack gets corrupted, or it outright fails to hook... And the fact that this would be a normal S4ModApi plugin suggests that it would be best if you could maybe supply a callback to the blit function in the API itself, where one could decide whether the currently scheduled render job should be handled by the callback or by the original blit function. If that doesn't bother you @nyfrk of course |
Yes I can do that. Edit: See here: https://github.com/nyfrk/S4ModApi/releases/tag/v0.5 You can now use the AddBltListener function. I added a Here is an example that skips the drawing whenever the 'U' key is down.
If you are interested: That is the code that implements it. I observed that we must preserve the XMM registers, so I saved them onto the stack too. I also did that for the FPU just to be safe. But I haven't tested yet which XMM registers must be saved, or whether the FPU state must be saved at all. I will do that later. Let me know if you need access to the XMM registers, as they probably contain useful information. I will then think about a solution. |
Do you know the color format of the backbuffer arg7? And what is the size, screen width * screen height * bytes per pixel, or is there something with the zoom level that we would have to consider? And oh boy do I not like C++.. I think I am just gonna call a C# library with all that information xD. I mean, type safety and a nice library system, what is there not to like, and it only adds one more .dll to all the other ^^ |
And a few notes for me for later:
Needed magic number for identification of upscaled files:
Even though the first byte is nearly always a "0", we make sure our identification number is unique.
Then the gfx id:
Then the id of the file inside the gfx:
Remove the offset and then combine to a uint16. We need 16 bits because some .gfx files contain up to 19,000 images, but no more than the maximum of a uint16 (65,535).
Then the rest of the file, which will be ignored.
** - these are offset by 2 to maintain a bit of compatibility with the original renderer in case of API errors, so that the image could still be displayed. |
Or we could just create a new surface and draw to that instead of using theirs. That would be a better solution with regard to making the HD Patch. To make the images truly HD, we would have to upscale the ground buffer, draw our objects at the new HD resolution onto the new surface, and blit that to the window. |
The original Blt function does determine how to scale the image. I think I can work that out and add a destination rect (and a source rect) to the arguments of the callback. You can use it to calculate a scale but I would not start using floating point operations just for that.
I am not sure. The game should execute the Blt function already in the correct order.
I thought about that. Having a second surface to draw on would mean that we must draw all objects onto that new surface (we cannot have some on the original surface and some on the new one). We cannot blend the old and new surfaces later, since that would break the z order of objects. Converting the game's Blt function to make it render to an RGBA32 surface would probably not be too difficult. I don't think there are more functions that render on that surface. The final blitting to the client area should not need any changes, since it uses the Blt method of the DirectDrawSurface and that should be able to handle RGBA32 sources (at least we can make it work, since I know that alpha blending is possible). The advantage of this is that your mod would not break other mods that, for example, add new units, tribes or buildings to the game (e.g. if someone decides to make a plugin that adds the tribes from Settlers 3 to the game). I think it would be a good idea to add a Blt function to the S4ModApi. That Blt function must be able to process RGBA32 or palettized RGB565 images. This way other mods can extend the RenderObject method without making it incompatible with your RGBA32 mod. They could coexist, or it would even be possible for you to provide RGBA32 sprites of the additional tribe too, since it would just be another blit observed by your listener. However, if you don't provide RGBA32 sprites for the additional units, everything would still work fine. Having a Blt function in S4ModApi would allow us to implement everything that is desired anyway when blitting object sprites, like fog (objects becoming darker), team colors or that "growing building" animation when erecting a building. So basically I suggest that we add the following functions to the S4ModApi
|
Thing is, the blit function has the gfx data and palette, so it's rather easy to just create that texture ourselves with the exporter methods. Do you know how the black view range/radius is drawn? Is it just another surface that gets drawn on top of all objects? |
I would not cut that down. My suggestion is that, instead of you doing the blitting yourself in your plugin, we would create a Blt method in the S4ModApi. You would still get the data of the original GFX file (and all the other arguments), but you would call s4->BltObjectRGBA(rgbaImage,...) to get it drawn onto the game's surface. If the library does that, we can ensure compatibility with other mods that add graphics to the game that do not yet exist. The S4ModApi would basically hide the fact that it swaps the game's surface for an RGBA32 surface (thus allowing for some abstraction). The question is now: What would the argument list of s4->BltObjectRGBA look like? How would we want to add team colors or the different fogged versions of the sprites? I guess that we don't want to create 8 versions of a settler and, for each of these, 7 darkened versions. So we must somehow mask them or allow for passing a lambda function that does the darkening. Then the next question is: how would we want to customize the "growing building" animation? That is handled by the original Blt function too.
I assume that the palettes are laid out in a way that allows darkening a color by decrementing its color index. The game just decrements each pixel by a certain amount to draw a darker image. When you blit your RGBA images, you would probably decrement each channel of each pixel by a certain amount to achieve a similar effect. |
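For RGBA images, the per-channel decrement could be sketched like this. It is a minimal illustration that assumes tightly packed 8-bit RGBA pixels and a constant darkening amount; `darkenRgba` is a made-up name, and the amount that matches the game's fog levels is not known here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Darkens tightly packed RGBA pixels in place by subtracting `amount`
// from each color channel, clamping at 0. The alpha channel is untouched.
void darkenRgba(std::vector<std::uint8_t>& rgba, std::uint8_t amount) {
    assert(rgba.size() % 4 == 0);
    for (std::size_t i = 0; i < rgba.size(); i += 4) {
        for (std::size_t c = 0; c < 3; ++c) {
            std::uint8_t& v = rgba[i + c];
            v = (v > amount) ? static_cast<std::uint8_t>(v - amount) : 0;
        }
    }
}
```

The clamp matters: a plain `v - amount` on an unsigned byte would wrap around and turn dark pixels bright.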
But how would you achieve z ordering?
Like you said. The game executes these commands in a specific order.
We could use a two layer method. Have a unit texture without the team coloring and a second one with just the team coloring which will be tinted to the correct team.
The question was rather, how does the game know, when to tint the pixel dark? There has to be some sort of array storing that information. |
The plugins would still have to register a Blt listener and call the custom blt method from there thus ensuring the correct order. The plugin code could look something like this:
The arguments of the callback will be replaced by a single argument that gives a pointer to a struct that contains all the current arguments. This way plugins can alter the arguments and as a bonus the loop can be more efficient since we do not have to repush all arguments onto the stack each time we iterate the observers.
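A rough sketch of the struct-pointer design described above, assuming a listener returns true when it has handled the blit itself. `BltParams`, its fields and `BltDispatcher` are hypothetical names for illustration and are not the S4ModApi types.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical parameter block; the real field names are not given here.
struct BltParams {
    std::int32_t destX;
    std::int32_t destY;
    const std::uint8_t* imageData;
    const std::uint16_t* palette;
};

// A listener may modify *params and returns true to suppress the original blit.
using BltListener = std::function<bool(BltParams*)>;

class BltDispatcher {
public:
    void addListener(BltListener listener) {
        assert(static_cast<bool>(listener));
        listeners_.push_back(std::move(listener));
    }

    // Passes the same struct to every listener, so nothing has to be pushed
    // onto the stack again per observer. Returns true if a listener handled
    // the blit; otherwise the caller falls through to the original Blt with
    // the (possibly altered) parameters.
    bool dispatch(BltParams* params) const {
        for (const auto& listener : listeners_) {
            if (listener(params)) return true;
        }
        return false;
    }

private:
    std::vector<BltListener> listeners_;
};
```

Because every listener sees the same struct, an earlier listener can adjust a position or swap a palette and later listeners (or the original function) pick up the change.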
So essentially you would use some kind of mask and change the hue, similar to how one would do it in Photoshop when using a mask to select the area to work with. I am not sure how good this can be quality-wise. It would probably be easier if we would just use the S4GFX tool to export 8 versions of each sprite (one per team color). This would allow for maximum customization and would probably be easier for us (at the expense of memory usage, of course). Edit: Quality problems can be solved by using a palettized mask image and then providing 8 different palettes.
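The palettized variant from the edit could look roughly like this: one shared 8-bit index image and one RGBA32 palette per team. This is a sketch under assumptions; the palette layout, the `0xRRGGBBAA`-style values in the usage note and the names are illustrative only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kTeamCount = 8;
using Palette = std::array<std::uint32_t, 256>;  // one RGBA32 color per index

// Expands an 8-bit indexed sprite to RGBA32 using the palette of the given
// team. Only the palette differs between teams; the index data is shared.
std::vector<std::uint32_t> colorize(const std::vector<std::uint8_t>& indices,
                                    const std::array<Palette, kTeamCount>& teamPalettes,
                                    std::size_t team) {
    assert(team < kTeamCount);
    const Palette& palette = teamPalettes[team];
    std::vector<std::uint32_t> out;
    out.reserve(indices.size());
    for (std::uint8_t index : indices) {
        out.push_back(palette[index]);
    }
    return out;
}
```

This keeps one copy of the pixel data per sprite and costs only 8 KiB for the eight palettes, instead of eight full-color copies of every sprite.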
This information is stored in the terrain array. It is an array of DWORDs. The 1st byte is the terrain type (grass, sand, etc), the 2nd is the height and the 4th is related to the fow (I think the fow level was the 3 least significant bits of this 4th byte). I think the fow of the currently blitted world position is already in one of the static variables. So it shouldn't be too hard to figure that out. |
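Decoding one entry of that terrain array could be sketched as below. It assumes that the "1st byte" is the least significant byte of the DWORD (its first byte in little-endian memory); the struct and function names are made up, and the fog-of-war bits follow the tentative description above.

```cpp
#include <cassert>
#include <cstdint>

struct TerrainCell {
    std::uint8_t type;      // terrain type (grass, sand, ...)
    std::uint8_t height;
    std::uint8_t fogLevel;  // 0..7, the 3 least significant bits of byte 4
};

// Splits one terrain DWORD into its fields. Assumes byte 1 is the least
// significant byte; the 3rd byte is not covered by the description.
TerrainCell decodeTerrain(std::uint32_t cell) {
    TerrainCell t;
    t.type = static_cast<std::uint8_t>(cell & 0xFF);
    t.height = static_cast<std::uint8_t>((cell >> 8) & 0xFF);
    t.fogLevel = static_cast<std::uint8_t>((cell >> 24) & 0x07);
    assert(t.fogLevel <= 7);
    return t;
}
```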
The current problem with adding new units is: how would you add them? There is no really easy way of adding new units to the .gfx files, as the direction and job index list system is rather complicated. So there won't be any way of hacking the first few bytes of the image to point to our external .png file. The only way I could see is to hack existing units to facilitate custom units and then somehow identify them later in the blit function.
Yeah, that's what I meant. But I think it would be easier to just have 8 versions of the units, though that would increase our space consumption eightfold. I'd need to check how much more space is really consumed in the end when we're packing everything. |
That is not a problem the HD patch must solve, but I would add new objects by expanding the switch in the RenderObject function of the game. If an identifier is unknown, the game will try to blit the first image in the gfx archive, which is usually a black placeholder, so we just hook that and draw the correct image instead. For the logic it would be just another instance of the CSettler or CBuilding class in the object pool.
To reduce memory usage we could create a blitting function that blits directly from compressed images. After all, we have to write our own blitter anyway, so this would probably be a reasonable choice. Could you add an export mode that outputs, for each sprite, 8 colored versions as premultiplied RGBA32 bitmaps with 3 times the resolution of the original? For images that do not use team colors (like trees, animals etc.) there would be just one version, of course. Then we could see how many gigabytes we are talking about. We could multiply this by 4/3 to estimate the cost of mip maps too. The reason we should choose 3 times the original resolution is that the game only scales images up to 3 times. More resolution would be pointless unless we hack the game to allow more zoom. Edit: With 3 times the resolution I mean that a 2x2 pixel large sprite should become 6x6 pixels large. |
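The size estimate from this comment can be written down as a small calculation. This is only a back-of-the-envelope sketch for uncompressed RGBA32 data; the function names are made up.

```cpp
#include <cassert>
#include <cstdint>

// Bytes for one sprite stored as RGBA32 (4 bytes per pixel) at `scale`
// times the resolution in each dimension, in `versions` colored variants.
std::uint64_t spriteBytes(std::uint32_t width, std::uint32_t height,
                          std::uint32_t scale, std::uint32_t versions) {
    assert(scale > 0 && versions > 0);
    const std::uint64_t pixels = std::uint64_t{width} * scale * height * scale;
    return pixels * 4 * versions;
}

// A full mip chain adds 1/4 + 1/16 + ... of the base size, which converges
// to 1/3 of it, hence the factor 4/3.
std::uint64_t withMipMaps(std::uint64_t baseBytes) {
    return baseBytes * 4 / 3;
}
```

For the 2x2 sprite from the edit, `spriteBytes(2, 2, 3, 1)` is 6 * 6 * 4 = 144 bytes, 1152 bytes with 8 team versions, and `withMipMaps(144)` gives 192 bytes. Note that the scale enters squared, so 3x costs 2.25 times as much as 2x.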
I'm gonna update the exporter to apply the team colors
Isn't 3x a bit overkill? 2x can already result in big file sizes. you have to consider, that we are talking about 18000 sprites per tribe that x8 x3 is A LOT. The buildings are at 2x resolution saved as png already at 213 MiB. |
OK, let's try 2x first. |
I've exported and scaled every texture in 20.gfx. With 2x bilinear scaling we get around 43 MiB, plus or minus a few MiB for the extra pixel information when AI upscaling. So with a full lobby we would see around 350 MiB (43*8, assuming all other tribes roughly share the same space requirements) in memory usage just for the units, plus the requirement for the buildings, but that would only be around 860 MiB at most when loading every tribe and every building at once. The question is: do we load the whole texture container once when needed, or do we load textures on demand (probably the whole image group containing all the animation textures)? The latter could result in lag, as reading from files is rather slow, even if we cache the textures for a while. If we go with just loading the whole file, we should probably make the game large address aware so that we don't run out of 32-bit address space. |
Is this using RGBA32 BMPs or PNGs? PNGs are compressed and must be decompressed when blitting. Furthermore, we should consider whether to use mip maps. Otherwise we could get ugly effects due to the scaled blitting when zooming out.
Memory mapping the container would probably be the easiest and fastest solution. At least that lets Windows handle all the page swapping for us.
True, but we cannot set the laa flag for DLLs. (Well we can, but it has no effect since the flag on the main executable determines whether the entire process is laa). So we would have to set the flag for the S4_Main.exe. I am not sure how well the game operates with negative memory pointers. I think the game sometimes uses the sign bit to check whether a pointer is valid. So this must be carefully tested. |
Using pngs. But the compression isn't that high, as they are rather small files to begin with. We could save a few bytes if we save the files as raw color streams, but with run length encoding. RLE is rather fast to decode and can save a bit of space.
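To give an idea of how cheap run-length decoding is, here is a sketch for a made-up format: a sequence of 5-byte records, each a run length followed by one RGBA pixel. The record layout is an assumption for illustration and not a format the thread settled on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decodes a hypothetical run-length format: each 5-byte record holds a run
// length (0..255) followed by one RGBA pixel that is repeated that often.
std::vector<std::uint8_t> decodeRle(const std::vector<std::uint8_t>& encoded) {
    assert(encoded.size() % 5 == 0);
    std::vector<std::uint8_t> rgba;
    for (std::size_t i = 0; i < encoded.size(); i += 5) {
        const std::uint8_t run = encoded[i];
        for (std::uint8_t n = 0; n < run; ++n) {
            rgba.insert(rgba.end(), encoded.begin() + i + 1, encoded.begin() + i + 5);
        }
    }
    return rgba;
}
```

Sprites with large transparent or single-colored areas compress well this way, and a blitter could even skip transparent runs without expanding them.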
I don't really know if mip maps are that necessary if we use a scaling algorithm like bilinear or trilinear when blitting. If we use a custom surface, we could just use hardware-accelerated Direct2D, which gives us these options when blitting. |
Let's start simple. I propose that we just concatenate all PNGs into one big file. Then we will use libpng to load and blit them onto the surface and hope that it will be fast enough. Converting them to RGBA32 bitmaps will definitely blow up our memory usage.
I am not sure about that. Hardware accelerated blitting will probably be difficult since we need extra features like "growing buildings" or fog of war fading. I would create a software solution first (similar to the original function). |
How is the progress on the renderer @nyfrk? |
I am sorry. I currently do not have much free time. I will continue working on open source projects after the 5th of November. |
I did some work today. I ditched the idea of using libpng and used Gdiplus instead. Gdiplus comes with rgb565 support by default, supports more formats (png, jpeg, etc.) and provides nice blitting features (like polyline clipping for the erecting building animation).
The width and height of an image to render is determined by a table lookup. The game manages a table that is modified whenever the player changes the zoom. It maps widths/heights of a gfx to a width/height to render. I will add the table to the API too. If you don't want to wait for the API, you can use
Looks like some graphics are already rendered in high resolution. When fully zoomed in, thin objects like grass are displayed almost 1:1 (high resolution), whereas the chicken is upscaled (low resolution). We could use the flag to feed the game higher resolution images, though that would still limit us to the palette.
One annoyance with hooking the Blt function is that we have no way to determine the fog of war level at the currently rendered object, since the fog is already computed beforehand by passing an adjusted (darkened) palette. We cannot make a quick terrain lookup since we do not know the x and y position of the object on the map (at least we don't know it in the hook procedure). Fortunately, the game is very optimized and only knows one palette for any fogged object (despite there being multiple fow levels or different objects/team objects). So we can simply check whether the palette is the fow palette, and if it is, we draw the sprite slightly darkened.
Here is a demo that renders colored boxes behind objects. (Later we would remove them and blit real images.) Fogged sprites are rendered with a dark red box. It looks good so far, but there are still some issues we must fix:
|
That is good news! I implemented the identification of the gfx files today as well. It's more experimental, so if you need the feature I'd send you the application to apply it to a gfx file. How does the game handle the building animation? There has to be some kind of info tied to the object indicating the progress, isn't there? |
The game uses argument _6 for this. _6 is the current y position of the zigzag clipper. |
Oh well, I don't know how we could create that zigzag procedurally in code, but we could just create a mask texture and draw the building texture with the mask applied. That would fix that. |
I think I would first try the region clipping method of Gdiplus. |
Yes, absolutely. I was rather thinking about a way to preserve the old way of building for the future. We could also use different images for steps of the building process, but I think I am getting ahead of myself ^^ |
The nice thing about using a region clipper is that we should be able to translate it just using argument _6.
This is definitely possible. We could calculate the percentage of the construction and just change the image whenever we want. Edit: I tested that and it is not that easy. The "clipping mode" is only used when a part of the zigzag pattern is actually visible. If it is not, we cannot distinguish it from a non-built (zigzag pattern is below the bottom edge of the screen) or already built (zigzag pattern is above the top edge of the screen) building. Furthermore, the argument _6 is capped such that it is never below the bottom edge of the screen. That means we cannot do the percentage calculation when the zigzag pattern is not visible. We cannot use different images for different states unless we change how the game calls the Blt function. However, we can still change the zigzag pattern to something different like triangles, a sine wave or any other shape that fits within the 30 pixels height. The transparent-blue is the clipping region. |
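Generating the zigzag outline procedurally is not hard. The sketch below produces the vertices of a zigzag line whose top edge sits at a given y (which would come from argument _6) and whose teeth fit within a given height such as the 30 pixels mentioned above. The tooth width, the names and the idea of closing the polygon below the line are assumptions of this sketch.

```cpp
#include <cassert>
#include <vector>

struct Point {
    int x;
    int y;
};

// Returns the vertices of a zigzag line starting at x = left that alternates
// between y = top and y = top + height every halfTooth pixels. The last
// vertex may lie past `right`; the caller is expected to intersect the
// resulting region with the sprite rectangle.
std::vector<Point> zigzagLine(int left, int right, int top, int height, int halfTooth) {
    assert(right > left && height > 0 && halfTooth > 0);
    std::vector<Point> points;
    bool upper = true;
    for (int x = left;; x += halfTooth, upper = !upper) {
        points.push_back({x, upper ? top : top + height});
        if (x >= right) break;
    }
    return points;
}
```

Moving the pattern then only means passing a different `top`. Appending the two bottom corners of the sprite rectangle would close the outline into a polygon that a region clipper could use.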
Well, the already built building has a different id altogether, so that wouldn't really matter. So until we can maybe get the actual progress of the construction, the way the object itself handles it, we have to stay with the clipped version. Is there a way to get the corresponding object of what we draw? Or just a unique identifier of the object that's being drawn? |
The actual progress is stored in the CBuilding object in the object pool. We could get a reference to the rendered object by hooking these functions. Or we could just try to get a reference by stack hacking. There are actually two identifiers that are of interest:
Both of them could be somewhere on the stack when the Blt method is called. |
HDC would be nice since it would abstract from the color depth. The problem is that I don't know a good and fast way to create/manage an HDC for that purpose. You will probably have to construct your Gdiplus::Graphics object yourself from the raw arguments. This seems to be fast (at least acceptable), but it does not abstract from the color depth, and constructing it for each object to render is unnecessarily slow. I am not sure if it is a good idea to replace the HDC with a pointer to a Gdiplus::Graphics object (is it safe to share Gdiplus::Graphics across libraries?). That would resolve the speed and abstraction problem though. So far I prepared the following struct for you. It will be the argument the S4BltHook will provide you:
|
Can't we get the HDC, when the game renders the finished surface? EDIT: Is there a way to get the length of |
We would have to create another hook such that we can blit our surface just before the UI is rendered. This could indeed work out. Do you already have some offset to hook for that?
The palette is always 256 colors large. The imageData is accompanied with |
No, I couldn't find one yet |
I just pushed my latest efforts to the S4ModApi project. You should be able to render high resolution sprites using Gdiplus. Here is a snippet that I used to render the boxes. You should be able to alter it to render the appropriate sprites. I will add true color support later. However, you could probably also just blit the sprites to the final surface using the Frame Hook for true color sprites. |
@nyfrk I think we should go with the solution you recommended in #4 and hack the game's calls to the DirectDrawSurface to get a better image quality.
Some game images are hardcoded and very hard to upscale in a native way so that the game can understand them later on.
This would also allow us to change the main menu graphics, which was a limitation before.
The 4 byte identifier shouldn't be a problem. The only difficult bit would be redirecting the calls to load our own image and we would also need to be able to compress our images, as they can get rather numerous and large in disk space. [72.8MB for 20.gfx decompressed]
What would be needed to achieve this? Are we making a wrapper DLL for ddraw, or replacing the calls to CreateSurface and LockSurface in the game's exe?