Switching to custom DDraw texture rendering #11

WizzardMaker · 2020-09-08T15:36:01Z

@nyfrk I think we should go with the solution you recommended in #4 and hack the game's calls to the DirectDrawSurface to get a better image quality.

Some game images are hardcoded and very hard to upscale in a native way so that the game can understand them later on.
This would also allow us to change the main menu graphics, which was a limitation before.

The 4 byte identifier shouldn't be a problem. The only difficult bit would be redirecting the calls to load our own image and we would also need to be able to compress our images, as they can get rather numerous and large in disk space. [72.8MB for 20.gfx decompressed]

What would be needed to achieve this? Are we making a wrapper dll for ddraw, or replacing the calls to CreateSurface and LockSurface in the games exe?

nyfrk · 2020-09-09T17:18:32Z

@nyfrk I think we should go with the solution you recommended in #4 and hack the game's calls to the DirectDrawSurface to get a better image quality.

What would be needed to achieve this? Are we making a wrapper dll for ddraw, or replacing the calls to CreateSurface and LockSurface in the games exe?

I would start simple. Don't bother wrapping DDRAW.dll. I already made an ASI loader that gets you into process space. Just create an ASI plugin, find the vtable of the IDirectDrawSurface and hook its Blt method (index 5 in the table, counting from zero). In the hook procedure, get the 4 byte watermark of the source and replace the source image accordingly. The target surfaces should already be RGB24. Just make sure to have DDRAW.dll in your import table (dynamically loading it using LoadLibrary is not good while holding the loader lock in DllMain). This should allow you to blt images without palette limitations. Another aspect is the team colors. We will probably need every unit in 8 versions, each with a different team color. The team colors the game uses are documented here. The team colored sprites have an annoying side effect: we have to read the team palette to determine which version we must draw.

The 4 byte identifier shouldn't be a problem. The only difficult bit would be redirecting the calls to load our own image and we would also need to be able to compress our images, as they can get rather numerous and large in disk space. [72.8MB for 20.gfx decompressed]

I would probably put all bitmaps in a large file and then memory map it. If virtual address space gets scarce I would just keep all files on the hard drive and load them every time a frame needs them. When ~10 consecutive frames did not use an image I would unload it.
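The unload rule described above could be sketched roughly like this. The class name, the `LoadFromDisk` placeholder and the exact threshold are assumptions made for illustration, not code from the game or from any existing plugin:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical sprite cache: an entry is unloaded once it has gone unused
// for kMaxIdleFrames consecutive frames.
class SpriteCache {
public:
    static constexpr std::uint32_t kMaxIdleFrames = 10;

    // Marks the sprite as used in the current frame, loading it on a miss.
    const std::vector<std::uint8_t>& Get(std::uint32_t id) {
        auto it = entries_.find(id);
        if (it == entries_.end())
            it = entries_.emplace(id, Entry{LoadFromDisk(id), frame_}).first;
        it->second.lastUsedFrame = frame_;
        return it->second.pixels;
    }

    // Call once per rendered frame; drops sprites that were idle too long.
    void EndFrame() {
        ++frame_;
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (frame_ - it->second.lastUsedFrame > kMaxIdleFrames)
                it = entries_.erase(it);
            else
                ++it;
        }
    }

    bool IsLoaded(std::uint32_t id) const { return entries_.count(id) != 0; }

private:
    struct Entry {
        std::vector<std::uint8_t> pixels;
        std::uint32_t lastUsedFrame;
    };

    // Placeholder for the real file read; returns a dummy 4-byte buffer.
    static std::vector<std::uint8_t> LoadFromDisk(std::uint32_t id) {
        return std::vector<std::uint8_t>(4, static_cast<std::uint8_t>(id));
    }

    std::unordered_map<std::uint32_t, Entry> entries_;
    std::uint32_t frame_ = 0;
};
```

A renderer would call `Get` for every sprite it draws and `EndFrame` once after presenting the frame. Whether a linear sweep per frame is cheap enough depends on how many sprites are resident at once.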

Some game images are hardcoded and very hard to upscale in a native way so that the game can understand them later on.
This would also allow us to change the main menu graphics, which was a limitation before.

Can you elaborate more on this. What exactly happens when you swap the menu images and what would be the desired outcome?

WizzardMaker · 2020-09-09T20:41:40Z

Seems like the game doesn't really use the IDirectDrawSurface7 Blt (or BltFast/BltBatch) to draw each building to the screen. It looks like it draws everything by itself onto the locked surface each frame, instead of creating surfaces for each building/floor/unit.

DDWRAPPER: Lock Surface Wrapper called! 
DDWRAPPER: Lock Surface Wrapper called! 
DDWRAPPER: Lock Surface Wrapper called! 
DDWRAPPER: Lock Surface Wrapper called! 
DDWRAPPER: Blit to l:15 t:8 r:255 b:168, from l:0 t:0 r:240 b:160 
DDWRAPPER: Blit to l:15 t:8 r:255 b:168, from l:0 t:0 r:240 b:160 
DDWRAPPER: Blit to l:0 t:0 r:281 b:210, from l:0 t:0 r:281 b:210 
DDWRAPPER: Blit to l:209 t:928 r:984 b:1026, from l:0 t:0 r:775 b:98 
DDWRAPPER: Blit to l:0 t:250 r:210 b:600, from l:0 t:0 r:210 b:350 
DDWRAPPER: Blit to l:0 t:600 r:210 b:768, from l:0 t:0 r:210 b:168 
DDWRAPPER: Blit to l:0 t:768 r:210 b:1024, from l:0 t:0 r:210 b:256 
DDWRAPPER: Blit to l:0 t:210 r:210 b:250, from l:0 t:0 r:210 b:40

This can best be seen in the main menu. It Blts a surface the size of the screen, and as soon as something changes (like the mouse hovering over a button) it locks the surface.

DDWRAPPER: Blit to l:0 t:0 r:1309 b:1027, from l:17 t:0 r:782 b:600 
DDWRAPPER: Blit to l:0 t:0 r:1309 b:1027, from l:17 t:0 r:782 b:600 
DDWRAPPER: Blit to l:0 t:0 r:1309 b:1027, from l:17 t:0 r:782 b:600 
DDWRAPPER: Blit to l:0 t:0 r:1309 b:1027, from l:17 t:0 r:782 b:600 
DDWRAPPER: Lock Wrapper called! 
DDWRAPPER: Lock Wrapper called! 
DDWRAPPER: Lock Wrapper called! 
DDWRAPPER: Blit to l:0 t:0 r:1309 b:1027, from l:17 t:0 r:782 b:600 
DDWRAPPER: Lock Wrapper called! 
DDWRAPPER: Lock Wrapper called! 
DDWRAPPER: Blit to l:0 t:0 r:1309 b:1027, from l:17 t:0 r:782 b:600

I checked the Clipper, attached surfaces, overlay surfaces, nothing..

nyfrk · 2020-09-09T20:51:28Z

The game uses many GDI functions. Can you check BitBlt? I think GDI is also used for drawing the panels. Maybe it's the same for the buildings.

It looks like it draws everything by itself onto the locked surface each frame, instead of creating surfaces for each building/floor/unit.

The game does not create a surface for each object.

The game uses about 13 layers / surfaces. Each surface is refreshed at a different rate. One surface holds the mini map, another one the menu panels, one the terrain/background and one is for the objects, buildings etc. When the game wants to render a frame it just puts all these layers on top of each other.

WizzardMaker · 2020-09-09T21:10:15Z

How can I patch GDI functions, without creating a proxy DLL?

nyfrk · 2020-09-09T21:34:16Z

Create a normal DLL/ASI project in Visual Studio. Follow this example. Make sure to have GDI32.dll in your imports table.

In DllMain just write a short jmp at the beginning of BitBlt. Just replace the first two bytes for this. Let it jump to 5 bytes before the function (jmp BitBlt-5) (there is a 2 byte short jump instruction for this, 0xEB 0xF9). Create a 5 byte near jmp (0xE9) instruction in the 5 bytes before the function's entry and let it jump to your hook procedure. Microsoft calls this technique "hotpatching" and it is well defined for WinAPI functions. On 32-bit Windows, Microsoft places a two byte nop (mov edi, edi) at the beginning of most WinAPI functions that you can overwrite. Many libraries can set this hook for you.

For the hook procedure create a function with the same calling convention and arguments as BitBlt. At the end of your hook procedure call the original BitBlt function and offset it by 2 bytes (call BitBlt+2). Make sure to return its return value.

It should look like this:

BitBlt-5  jmp BitBltHook
BitBlt+0  jmp BitBlt-5
BitBlt+2  ...

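BOOL WINAPI BitBltHook(HDC hdcDest, int x, int y, int cx, int cy,
                       HDC hdcSrc, int xSrc, int ySrc, DWORD rop) {
    // do your stuff here
    // enter the original 2 bytes past the entry, i.e. behind the patched short jmp
    auto OrigBitBlt = reinterpret_cast<decltype(&BitBlt)>(
        reinterpret_cast<BYTE*>(&BitBlt) + 2);
    return OrigBitBlt(hdcDest, x, y, cx, cy, hdcSrc, xSrc, ySrc, rop);
}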

Many hooking libraries already do this kind of stuff for you. You can also be more intrusive by writing your 5 byte patch directly at BitBlt (thus overwriting the frame setup). If you do this, make sure to repair the frame in your hook procedure.
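To make the byte layout of such a hotpatch concrete, here is a minimal sketch that only computes the bytes to write. `HotpatchBytes` and `BuildHotpatch` are names made up for this example, and it assumes 32-bit addresses. Actually writing the bytes (for example after making the page writable with VirtualProtect) is left out:

```cpp
#include <cassert>
#include <array>
#include <cstdint>

// Bytes for hotpatching a 32-bit function at address `target`:
// 5 bytes that go to target-5, and 2 bytes that go to target.
struct HotpatchBytes {
    std::array<std::uint8_t, 5> longJmp;   // written at target - 5
    std::array<std::uint8_t, 2> shortJmp;  // written at target
};

HotpatchBytes BuildHotpatch(std::uint32_t target, std::uint32_t hook) {
    HotpatchBytes p{};
    // E9 rel32: the displacement is relative to the end of the 5-byte
    // instruction, and that end is exactly `target`.
    const std::uint32_t rel = hook - target;
    p.longJmp = {0xE9,
                 static_cast<std::uint8_t>(rel),
                 static_cast<std::uint8_t>(rel >> 8),
                 static_cast<std::uint8_t>(rel >> 16),
                 static_cast<std::uint8_t>(rel >> 24)};
    // EB rel8: relative to target + 2, so -7 (0xF9) lands on target - 5.
    p.shortJmp = {0xEB, 0xF9};
    return p;
}
```

Writing the 5-byte jump first and the 2-byte jump last keeps the function callable at every intermediate step, which is the point of this layout.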

WizzardMaker · 2020-09-09T22:29:50Z

Nope, it's not BitBlt, StretchBlt, PlgBlt or MaskBlt. BitBlt gets called like 10 times in the initialization of the engine, but then never again.

There are a lot of SelectObject calls and a few CreateCompatibleBitmap calls (although no CreateBitmap)

I'm slowly going through every function that would eventually need to call SelectObject..

nyfrk · 2020-09-09T22:56:05Z

Then I am out of ideas. Maybe BB has used an engine that does the blitting manually. I never looked any deeper than the frame composing code that overlays all the ui layers. But i am pretty sure that they were using a windows function since i once stepped over each of those layers to visualize what parts of the frame they contain. Maybe i still have some notes somewhere.

Do you still know the address of the vtable you hooked when hooking Blt? Maybe you got a different interface than the game requested and thus your hook was never called simply because of different interfaces.

WizzardMaker · 2020-09-09T22:59:25Z

I found it! It creates the images with CreateRectRgn! At least it calls the function every time something happens

WizzardMaker · 2020-09-09T23:07:19Z

Do you still know the address of the vtable you hooked when hooking Blt? Maybe you got a different interface than the game requested and thus your hook was never called simply because of different interfaces.

Well, I use minhook to hook the function calls, and they definitely get called, some more than others

nyfrk · 2020-09-09T23:55:17Z

Well, I use minhook to hook the function calls, and they definitely get called, some more than others

Ah okay. Then the game uses IDirectDrawSurface7::Blt only to draw the intermediate layers. Good catch!

I just tested a bit and could not find where the game calls CreateRectRgn. It looks like it is only called by gdi itself when drawing a text. Never for sprites or other content. The game uses DrawTextA to draw the texts in the panels.

WizzardMaker · 2020-09-10T00:10:17Z

Yeah, it uses user32.dll and win32u.dll to draw the information!
SetDIBits is being called from user32, which in turn is being called from win32u

The exe imports a lot of functions from USER32, but none from WIN32.
The most interesting imported functions are: LoadBitmapA and LoadImageA

WizzardMaker · 2020-09-10T01:41:24Z

Okay. Seems like they extended the IDirectDrawSurface7 interface.

The vtable function call to v4[1] + 100 goes directly to the IDirectDrawSurface7->Lock function (or rather my patched version)

Also, this function gets apparently called in the main draw function. (There are mentions of "GFX ENGINE: Can't render objects without having a world!" and "GFX ENGINE: DATA ERROR: Illegal value in iDirection", so thats where my conclusion comes from)

nyfrk · 2020-09-10T02:07:12Z

Yeah,it uses user32.dll and win32u.dll to draw the information!
SetDIBits is being called from user32, which in turn is being called from win32u

Sorry i cannot follow along. GDI32.SetDIBits calls gdi32full.SetDIBits. The latter is an internal library. There is no SetDIBits in User32.dll.
win32u.dll is an internal library. I cannot imagine that the game uses win32u.dll directly. If it would, it would dynamically link to it. I dont see a reason why BlueByte should have done this. It is probably rather imported by one of the system libraries. Furthermore the game imports GDI32.GetDIBits and not GDI32.SetDIBits. Thus SetDIBits is probably called only by functions from one of the libraries and not the game itself. LoadBitmapA causes a call to SetDIBits.

The exe imports a lot of functions from USER32, but none from WIN32.

The game uses User32.LoadBitmapA (which redirects to User32.LoadImage) to load images from the resource section. To be precise, it loads the cursor from the resource section. LoadBitmapA does not allow loading images from memory or from the gfx archives. There are only 2 images in the resource section (both were attached inline and are not preserved here).

The game uses a streamer to dynamically load and cache images that are needed to render. It unloads them when they are not on screen anymore (with some exceptions and a little delay). It loads the images directly from the gfx archives (hard drive) during gameplay.

Okay. Seems like they extended the IDirectDrawSurface7 interface.

Thats impossible ;)

You are right! Looks like they replaced the IDirectDrawSurface7::Blt function with one that reads stuff from the gfx archives (or well, they rather hacked the internals of DDRAW.dll with their overlay.dll). They also cache the images so that there are not too many read operations from the hard drive. So basically the DDRAW.dll is not a real DDRAW.dll that was written/released by Microsoft ^^

I always wondered why their Blt method has one argument more than usual haha. Guess that explains it. In that case have a look at the opcode: S4_Main.exe+266BC3. That Blt method is sometimes magically causing read operations to the hard drive ;)

WizzardMaker · 2020-09-10T02:12:46Z

I think we can get some pretty accurate function information when we take a look at the GfxEngine.dll of the editor, as that exports a lot of gfx related functions, like BlitFrameToDib, CreateGuiSurface, GetPlayerColor or RenderObject/RenderResource.

I don't think that they changed the functions that much, so we could just byte search the game exe for those functions from the dll. We really only need RenderObject/RenderResource and we need to find where the ground textures are rendered.

nyfrk · 2020-09-11T06:26:08Z

So have you made any progress? I realized that i already found the render function when creating the unlimited selection mod. We should start creating a wiki or something so we stop reversing things twice ^^
Anyway here are the render functions for the settlers 4 that may help us (md5 of my S4_Main.exe is C13883CBD796C614365AB2D670EAD561. Let me know if your version differs from mine, i will then send you aob patterns):

S4_Main.exe+261B90    Render a settler (including selection markers)
S4_Main.exe+263110    Render a border stone
S4_Main.exe+262E80    Render an object (like tree, stones, chickens, geologist signs, good piles (except those attached to buildings) etc)
S4_Main.exe+262090    Render a building. This blits many times e.g. for flags on buildings, settlers working in building, doors on towers, piles on buildings, building effects like rotating rotor of the mill or the anvil of the smiths, selection markers etc
S4_Main.exe+263800    Render the colored bubbles when placing a building or changing the work area of a building.
S4_Main.exe+261FA0    Render the waves on the water.
S4_Main.exe+2631F0    Render a vehicle (ships, war machines etc).

The game is actually coded in a really modular way. I don't see any reason why we could not add and render completely new units of our own (like spearmen). Note that the game renders on an intermediate surface. It will be blitted onto the backbuffer by the function we already found above. I think i would not follow the "watermarking" approach anymore. Lets just mod the game to use the unpacked gfx files altogether. We could even do alpha blitting. The blitting function the game uses is at S4_Main.exe+25F980. It does palette mixing before blitting (for the team colors). It is a fastcall (with 9 arguments, the first two passed by ecx and edx, the rest on stack). Caller cleans the stack.
The team color palette mixing is probably the reason BlueByte created a custom blitting function.

WizzardMaker · 2020-09-11T11:14:36Z

Lets just mod the game to use the unpacked gfx files altogether

Yeah, that was the plan with this approach. Have every extracted png of a gfx packed together in a custom gfx file and just load these. This would make it much easier to upscale the images without any quality loss due to engine restrictions. This could also enable us to add real soft shadows to objects, instead of that black pattern they used, like you said.

WizzardMaker · 2020-09-11T14:22:40Z

Should we hook each "Render..." function, or should we just hook their blit method at S4_Main.exe+25F980 and blit to the backbuffer ourselves if we recognize the texture to be drawn?

WizzardMaker · 2020-09-11T19:07:33Z

I'm currently facing problems with hooking into the blit function at S4_Main.exe+25F980 and calling the trampoline function.
The new function gets called, but the esp run time check gets caught and the game crashes.

Here is my typedef of the blit function I got from Ghidra and my hook + orig function call:

typedef ULONGLONG(__fastcall origBlitToBackbuffer)(int EAX, byte* pEDX, int param_1_00, int param_2_00, int param_5, UINT param_6, int param_7, int param_8, int param_9);

origBlitToBackbuffer *oRO;
ULONGLONG __fastcall BlitToBackbuffer(int EAX, byte* pEDX, int param_1_00, int param_2_00, int param_5, UINT param_6, int param_7, int param_8, int param_9)
{
	return oRO(EAX, pEDX ,param_1_00, param_2_00, param_5, param_6, param_7, param_8, 0);
}

nyfrk · 2020-09-12T00:04:31Z

Should we hook each "Render..."

All the render functions call the Blt function multiple times (eg. towers have doors that are painted over the building). Since we have to draw each of these too, we would have to completely reimplement the Render functions. Imho, thats too much work (but not impossible). I would rather reimplement the Blt function. I would do stack hacking in the hook procedure to extract the exact sprite that we have to draw from the stack. All other stuff (like position on screen etc) is already in the arguments of the Blt function.

Here is my typedef of the blit function I got from Ghidra and my hook + orig function call:

Don't trust ghidra blindly. It is flawed when it comes to "fairly" recent compiler quirks.

__fastcall (as specified for the Microsoft compiler) expects the function to do the stack cleanup (similar to __stdcall). However the function you are trying to hook is a "compiler invented calling convention" (Raymond Chen has an article about that). That means the compiler created a new calling convention for optimization purposes. It is almost a fastcall but this one expects the caller to do the stack cleanup. If you implement it like you did above, it will crash since now the caller AND your hook proc will clean the stack. I am not sure if many libraries can handle this for you. You will likely have to use inline assembler for this as i am not aware of any possibility to declare new custom calling conventions.

WizzardMaker · 2020-09-12T00:32:36Z

Okay, just turns out, that minhook is absolute trash and it completely destroys the instructions of the function with that inserted jmp. Gonna switch to polyhook2 and hope it gets better.

expects the caller to do the stack cleanup

Wouldn't that mean I would need to make the new method __declspec(naked) and just return with something in EDX and EAX? I don't even know what the original blit really returns to the render functions to be honest ^^

nyfrk · 2020-09-12T00:51:35Z

Okay, just turns out, that minhook is absolute trash and it completely destroys the instructions of the function with that inserted jmp. Gonna switch to polyhook2 and hope it gets better.

Well that is kind of expected. As long as it repairs the replaced instructions (by adding them to the hook procs prolog) its fine. After all this function is not compiled with /hotpatch and thus has no 2-byte nop at the beginning.

Wouldn't that mean I would need to make the new method __declspec(naked) and just return with something in EDX and EAX?

I would write a naked function that converts it to stdcall. Here is an example, and here is a more complex one. This also allows you to push the return address as an additional argument (useful for stack hacking later). And it allows you to repair the instructions you replaced.

I don't even know what the original blit really returns to the render functions to be honest ^^

Its just a VOID function (return value is ignored by the game).

WizzardMaker · 2020-09-12T01:01:58Z

As long as it repairs the replaced instructions (by adding them to the hook procs prolog) its fine

Sadly that isn't the case. An innocent looking "mov eax, dword ptr[]", gets obliterated to an "or" opcode

WizzardMaker · 2020-09-12T03:30:43Z

Well, I think I am at my limit regarding C++ and hooking into functions. It either crashes, because the stack gets corrupted or it outright fails to hook...

And the fact that this would be a normal S4ModApi plugin suggests that it would be best if you could maybe supply a callback to the blit function in the API itself, where one could decide whether the currently scheduled render job should be handled by the callback or the original blit function.

If that doesn't bother you @nyfrk of course

nyfrk · 2020-09-12T12:53:06Z

Yes I can do that.

Edit: See here: https://github.com/nyfrk/S4ModApi/releases/tag/v0.5

You can now use the AddBltListener function. I added a caller parameter that allows you to see from where the Blt function was called (just in case it may be useful). Whenever you return a non-zero value the original Blt function is skipped (thus preventing the game from drawing the current sprite).

Here is an example that skips the drawing whenever the 'U' key is down.

s4->AddBltListener([](DWORD _0, DWORD _1, DWORD _2, DWORD _3, DWORD _4, DWORD _5, DWORD _6, DWORD _7, DWORD _8, BOOL discard, DWORD caller)->BOOL {
     return GetAsyncKeyState('U') < 0;
});

If you are interested: That is the code that implements it. I observed that we must preserve the XMM registers so i saved them onto the stack too. I also did that for the FPU just to be safe. But I haven't tested yet which XMM registers must be saved, or whether the FPU state must be saved at all. I will do that later. Let me know if you need access to the XMM registers as they probably contain useful information. I will then think about a solution.

WizzardMaker · 2020-09-12T22:12:28Z

Great! That works flawlessly!

Now slowly finding out, what each argument does..
currently 5 out of 9

arg0 - ?
arg1 - the data buffer of the image, as it is in the .gfx file
arg2 - probably the id of the object. returning when it's under 100 removes stuff like water ripples, units and grass, above 100 removes ships and buildings
arg3 - ?
arg4 - x position to draw on the surface
arg5 - y position to draw on the surface
arg6 - ?
arg7 - the color buffer of the backbuffer
arg8 is maybe something with the minimap, but I'm not sure

Here is a simple plugin, that blocks all rendering, when the x value is below 500 or the id is below 100
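The filter rule itself could look something like the sketch below. It relies on the tentative argument meanings listed above (arg2 as object id, arg4 as x position), which are not confirmed, and `BltArgs` / `ShouldSkipBlit` are names invented for this example rather than part of S4ModApi:

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the nine raw Blt arguments. The field meanings follow the
// tentative notes above and are not confirmed.
struct BltArgs {
    std::uint32_t arg[9];
};

// Returns true when the original blit should be skipped.
bool ShouldSkipBlit(const BltArgs& a) {
    const std::uint32_t objectId = a.arg[2];                 // assumed: object id
    const auto x = static_cast<std::int32_t>(a.arg[4]);      // assumed: x position
    return x < 500 || objectId < 100;
}
```

Inside a listener registered with AddBltListener, the nine DWORD parameters would be copied into a `BltArgs` and the result of `ShouldSkipBlit` returned, since a non-zero return value skips the original Blt.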

WizzardMaker · 2020-09-12T22:33:10Z

Do you know the color format of the backbuffer arg7? And what is the size, screen width * screen height * bytes per pixel, or is there something with the zoom level that we would have to consider?

And oh boy do I not like C++.. I think I am just gonna call a C# library with all that information xD. I mean, type safety and a nice library system, what is there not to like, and it only adds one more .dll to all the other ^^

WizzardMaker · 2020-09-12T23:13:20Z

And a few notes for me for later:

Needed magic number for identification of upscaled files:

buffer [0] - byte 0xF0
buffer [1] - byte 0x0D

even though the first byte is nearly always a "0", we make sure to make our identification number unique

then the gfx id:

buffer [2] ** - byte 0x02 - 0xFF - though the max is only 0x29 (41)

then the id of the file inside the gfx:

buffer [3] ** - byte 0x02 - 0xFF
buffer [4] ** - byte 0x02 - 0xFF

Remove the offset and then combine to a uint16. We need 16 bits because some .gfx files contain up to 19,000 images, but no more than the max of uint16 (65,535).

then the rest of the file, which will be ignored

** - these are offset by 2, to maintain a bit of compatibility with the original renderer, in case of api errors, so that the image could still be displayed.
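As a rough sketch of how such a marker could be written and read back: the notes do not say how the two index bytes are combined, so this example assumes they are base-254 digits. A plain high/low byte split would not fit, because a raw byte of 0xFE or 0xFF overflows once the offset of 2 is added. `SpriteRef`, `WriteMarker` and `ReadMarker` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct SpriteRef {
    std::uint8_t gfxId;        // which .gfx archive
    std::uint16_t imageIndex;  // image inside that archive
};

constexpr std::uint8_t kMagic0 = 0xF0, kMagic1 = 0x0D;
constexpr unsigned kOffset = 2;
constexpr unsigned kBase = 256 - kOffset;  // 254 usable values per byte (assumption)

// Writes the 5-byte marker; fails if a value does not fit after the offset.
bool WriteMarker(std::uint8_t* buf, std::size_t len, SpriteRef ref) {
    if (len < 5 || ref.gfxId >= kBase || ref.imageIndex >= kBase * kBase)
        return false;
    buf[0] = kMagic0;
    buf[1] = kMagic1;
    buf[2] = static_cast<std::uint8_t>(ref.gfxId + kOffset);
    buf[3] = static_cast<std::uint8_t>(ref.imageIndex / kBase + kOffset);
    buf[4] = static_cast<std::uint8_t>(ref.imageIndex % kBase + kOffset);
    return true;
}

// Reads the marker back; fails on a missing magic number or a byte below the offset.
bool ReadMarker(const std::uint8_t* buf, std::size_t len, SpriteRef& out) {
    if (len < 5 || buf[0] != kMagic0 || buf[1] != kMagic1) return false;
    for (int i = 2; i < 5; ++i)
        if (buf[i] < kOffset) return false;
    out.gfxId = static_cast<std::uint8_t>(buf[2] - kOffset);
    out.imageIndex = static_cast<std::uint16_t>(
        (buf[3] - kOffset) * kBase + (buf[4] - kOffset));
    return true;
}
```

With this encoding the largest representable index is 253 * 254 + 253 = 64,515, which still covers the roughly 19,000 images mentioned above.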

WizzardMaker · 2020-09-12T23:31:52Z

Do you know the color format of the backbuffer arg7? And what is the size, screen width * screen height * bytes per pixel, or is there something with the zoom level that we would have to consider?

Or we could just create a new surface and draw to that instead of using theirs.

That would be a better solution in regards to making the HD Patch. To make the images truly HD, we would have to upscale the ground buffer and draw our objects with the new HD resolution to the new surface and blit that to the window

nyfrk · 2020-09-13T19:12:31Z

We could already draw our images, when we get the zoom level, as that defines, how to scale the image.

The original Blt function does determine how to scale the image. I think I can work that out and add a destination rect (and a source rect) to the arguments of the callback. You can use it to calculate a scale but I would not start using floating point operations just for that.

We also need to find out, how the game does its z sorting

I am not sure. The game should execute the Blt function already in the correct order.

I agree on the performance penalty in C#, we really need every ms we can get.

We could just render to a custom buffer, instead of fiddling with the surface, or how the orig. function draws to that. That buffer would be drawn every time the game blits to the screen with the original IDirectDrawSurface

I thought about that. Having a second surface to draw on would mean that we must draw all objects onto that new surface (we cannot have some on the original surface and some on the new one). We cannot blend the old and new surfaces later since it will break the z order of objects.

Converting the games Blt function to make it render to a RGBA32 surface would probably not be too difficult. I don't think there are more functions that render on that surface. The final blitting to the client area should not need any changes since it uses the Blt method of the DirectDrawSurface and that should be able to handle RGBA32 sources (at least we can make it work, since i know that AlphaBlending is possible). The advantage of this is that your mod would not break other mods that for example add new units, tribes or buildings to the game (e.g. if someone decides to make a plugin that adds the tribes from settlers 3 to the game).

I think it would be a good idea to add a Blt function to the S4ModApi. That Blt function must be able to process RGBA32 or palettized RGB565 images. This way other mods can extend the RenderObject method without making it incompatible with your RGBA32 mod. They could coexist or it would even be possible for you to provide RGBA32 sprites of the additional tribe too since it would just be another blit observed by your listener. However if you don't provide RGBA32 sprites for the additional units everything would still work fine.

Having a Blt function in S4ModApi would allow us to make all the necessary implementations that are desired anyway when blitting object sprites. Like fog (objects becoming darker), team colors or that "growing building" animation when erecting a building.

So basically I suggest that we add the following functions to the S4ModApi

Blt, that allows us to blt onto the surface with optional fog, optional team color or/and optional building animation. That one can handle RGBA32 or the original palettized RGB565 images.
EnableTrueColor, that enables RGBA32 blitting by adding a second surface and patching the Blt function to produce RGBA32 colors.

WizzardMaker · 2020-09-14T14:48:10Z

I thought about that. Having a second surface to draw on would mean that we must draw all objects onto that new surface (we cannot have some on the original surface and some on the new one). We cannot blend the old and new surfaces later since it will break the z order of objects.

Thing is, the blit function has the gfx data and palette, so it's rather easy to just create that texture ourselves, with the exporter methods.
So we can just draw all of the images to the surface, both our new ones marked with the id in the beginning and the original images

Do you know how the black view range/radius is drawn? Is it just another surface, that gets drawn on top of all objects?

nyfrk · 2020-09-14T15:36:56Z

Thing is, the blit function has the gfx data and palette, so it's rather easy to just create that texture ourselves, with the exporter methods.
So we can just draw all of the images to the surface, both our new ones marked with the id in the beginning and the original images

I would not cut that down. My suggestion is that, instead of you doing the blitting yourself in your plugin, we would create a Blt method in the S4ModApi. You would still get the data of the original GFX file (and all the other arguments) but you would call s4->BltObjectRGBA(rgbaImage,...) to get it drawn onto the games surface. If the library does that, we can ensure compatibility with other mods that add graphics to the game that do not yet exist. The S4ModApi would basically hide the fact that it swaps the games surface to a RGBA32 surface (thus allow for some abstraction).

The question is now: What would the argument list of s4->BltObjectRGBA look like? How would we want to add team colors or the different fogged versions of the sprites? I guess that we don't want to create 8 versions of a settler and, for each of these, 7 darkened versions. So we must somehow mask them or allow for passing a lambda function that does the darkening. Then the next question is how would we want to customize the "growing building" animation? That is handled by the original Blt function too.

Do you know how the black view range/radius is drawn? Is it just another surface, that gets drawn on top of all objects?

I assume that the palettes are laid out in a way that allows darkening a color by decrementing its index. The game just decrements each pixel by a certain amount to draw a darker image. When you blit your RGBA images, you would probably decrement each channel of each pixel by a certain amount to achieve a similar effect.
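For the RGBA case, the per-channel decrement could be sketched as follows. It assumes 4 bytes per pixel with alpha in the last byte. `DarkenRGBA` is a made-up helper and says nothing about how the game computes the amount:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Darkens an RGBA32 buffer in place by subtracting `amount` from the three
// color bytes of every pixel, clamping at 0. Alpha is left untouched.
// Assumes 4 bytes per pixel with alpha in the last byte.
void DarkenRGBA(std::uint8_t* pixels, std::size_t pixelCount, std::uint8_t amount) {
    for (std::size_t i = 0; i < pixelCount; ++i) {
        std::uint8_t* px = pixels + i * 4;
        for (int c = 0; c < 3; ++c)
            px[c] = px[c] > amount ? static_cast<std::uint8_t>(px[c] - amount) : 0;
    }
}
```

The clamp matters: without it dark pixels would wrap around to bright values instead of going black.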

WizzardMaker · 2020-09-14T17:47:52Z

You would still get the data of the original GFX file (and all the other arguments) but you would call
s4->BltObjectRGBA(rgbaImage,...) to get it drawn onto the games surface.

But how would you achieve z ordering?

The game should execute the Blt function already in the correct order.

Like you said. The game executes these commands in a specific order.

How would we want to add team colors or the different fogged versions of the sprites?

We could use a two layer method. Have a unit texture without the team coloring and a second one with just the team coloring which will be tinted to the correct team.

When you blit your RGBA images, you would probably decrement each channel of each pixel by a certain amount to achieve a similar effect

The question was rather, how does the game know, when to tint the pixel dark? There has to be some sort of array storing that information.

nyfrk · 2020-09-14T18:57:02Z

But how would you achieve z ordering?

The plugins would still have to register a Blt listener and call the custom blt method from there thus ensuring the correct order. The plugin code could look something like this:

s4->EnableTrueColor(); // enable true color mode for object surface
s4->AddBltListener([](DWORD _0, DWORD _1, DWORD _2, DWORD _3, DWORD _4, DWORD _5, DWORD _6, DWORD _7, DWORD _8, BOOL discard, DWORD caller)->BOOL {
   WORD* gfx = (WORD *)_1;
   if ((gfx[0] ^ gfx[1]) == 0x1337) { // check if gfx is watermarked (== binds tighter than ^, hence the parentheses)
       auto filename = "image" + std::to_string(gfx[0]) + ".bmp"; // needs <string>
       auto pixels = LoadARGBFromFile( filename );
       s4->BltObjectRGBA( pixels, ... ); // blit using RGBA image
       return 1; // skip original blitting
  } else {
       return 0; // non-watermarked sprites are drawn by the game
  }
});

The arguments of the callback will be replaced by a single argument that gives a pointer to a struct that contains all the current arguments. This way plugins can alter the arguments and as a bonus the loop can be more efficient since we do not have to repush all arguments onto the stack each time we iterate the observers.

We could use a two layer method. Have a unit texture without the team coloring and a second one with just the team coloring which will be tinted to the correct team.

So essentially you would use some kind of mask and change the hue similar to how one would do it in Photoshop when using a mask to select the area to work with. I am not sure how good this can be quality-wise. It would probably be easier if we would just use the S4GFX tool to export 8 versions of each sprite (one per team color). This would allow for maximum of customization and be probably easier for us (at the expense of memory usage of course).

Edit: Quality problems can be solved when using a palettized mask image and then providing 8 different palettes.

The question was rather, how does the game know, when to tint the pixel dark? There has to be some sort of array storing that information.

This information is stored in the terrain array. It is an array of DWORDs. The 1st byte is the terrain type (grass, sand, etc), the 2nd is the height and the 4th is related to the fow (I think the fow level was the 3 least significant bits of this 4th byte). I think the fow of the currently blitted world position is already in one of the static variables. So it shouldn't be too hard to figure that out.
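Going by that description, unpacking one terrain entry might look like the sketch below. It assumes that "1st byte" means the least significant byte of the DWORD (the byte at the lowest address on x86) and takes the fow level from the 3 lowest bits of the 4th byte, which is only a recollection above. `TerrainCell` and `DecodeTerrain` are names invented here:

```cpp
#include <cassert>
#include <cstdint>

struct TerrainCell {
    std::uint8_t type;      // terrain type (grass, sand, ...)
    std::uint8_t height;
    std::uint8_t fowLevel;  // 0..7, unconfirmed
};

TerrainCell DecodeTerrain(std::uint32_t cell) {
    TerrainCell t;
    t.type = static_cast<std::uint8_t>(cell & 0xFF);              // 1st byte
    t.height = static_cast<std::uint8_t>((cell >> 8) & 0xFF);     // 2nd byte
    t.fowLevel = static_cast<std::uint8_t>((cell >> 24) & 0x07);  // 3 LSBs of 4th byte
    return t;
}
```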

WizzardMaker · 2020-09-14T23:35:16Z

The problem with adding new units is: how would you add them? There is no really easy way to add new units to the .gfx files, as the direction and job index list system is rather complicated. So there won't be any way of hacking the first few bytes of the image to point to our external .png file.

The only way I could see is to hack existing units to facilitate custom units and then somehow identify them later in the blit function.

So essentially you would use some kind of mask and change the hue similar to how one would do it in Photoshop when using a mask to select the area to work with. I am not sure how good this can be quality-wise. It would probably be easier if we would just use the S4GFX tool to export 8 versions of each sprite (one per team color). This would allow for maximum of customization and be probably easier for us (at the expense of memory usage of course).

Yeah, that's what I meant. But I think it would be easier to just have 8 versions of the units, though that would increase our space consumption eightfold. I'd need to check how much more space is really consumed in the end, when we're packing everything.

nyfrk · 2020-09-15T08:31:50Z

The problem with adding new units is: how would you add them?

That is not a problem the HD patch must solve, but I would add new objects by expanding the switch in the RenderObject function of the game. If an identifier is unknown, the game will try to blit the first image in the gfx archive, which is usually a black placeholder, so we just hook that and draw the correct image instead. For the logic it would be just another instance of the CSettler or CBuilding class in the object pool.

Yeah, that's what I meant. But I think it would be easier to just have 8 versions of the units, though that would increase our space consumption eightfold. I'd need to check how much more space is really consumed in the end, when we're packing everything.

To reduce memory usage we could create a blitting function that directly blits from compressed images. After all we have to write our own blitter anyway so this would probably be a reasonable choice.

Could you add an export mode that allows us to output for each sprite 8 colored versions as premultiplied RGBA32 bitmaps with 3 times the resolution of the original? For images that do not use team colors (like trees, animals etc) there would be just one version of course. Then we could see about how many Gigabytes we are talking. We could multiply this by 4/3 to estimate the cost of mip maps too.

The reason we should choose 3 times the original resolution is that the game only scales images up to 3 times. More resolution would be pointless unless we hack the game to allow more zoom.

Edit: With 3 times the resolution I mean that a 2x2 pixel large sprite should become 6x6 pixels large.
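
For the size estimate, a small sketch of the arithmetic (the helper names are made up). RGBA32 is 4 bytes per pixel, and a full mip chain adds 1/4 + 1/16 + ... of the base size, which is where the factor 4/3 comes from.

```cpp
#include <cstdint>

// Uncompressed RGBA32 size of one sprite after scaling, for all color versions.
inline std::uint64_t SpriteBytes(std::uint32_t width, std::uint32_t height,
                                 std::uint32_t scale, std::uint32_t versions) {
    const std::uint64_t w = static_cast<std::uint64_t>(width) * scale;
    const std::uint64_t h = static_cast<std::uint64_t>(height) * scale;
    return w * h * 4u * versions;  // 4 bytes per RGBA32 pixel
}

// Approximate size including a full mip chain (base * 4/3, rounded down).
inline std::uint64_t WithMipMaps(std::uint64_t baseBytes) {
    return baseBytes * 4 / 3;
}
```

For the 2x2 example: at 3 times the resolution it becomes 6x6 pixels, i.e. 144 bytes per version, 1152 bytes for 8 team colors and 1536 bytes with mip maps.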

WizzardMaker · 2020-09-15T10:10:33Z

I'm gonna update the exporter to apply the team colors

3 times the resolution of the original?

Isn't 3x a bit overkill?

2x can already result in big file sizes. You have to consider that we are talking about 18000 sprites per tribe, and that x8 x3 is A LOT. The buildings at 2x resolution, saved as png, are already at 213 MiB.

nyfrk · 2020-09-15T10:15:01Z

Isn't 3x a bit overkill?

2x can already result in big file sizes. You have to consider that we are talking about 18000 sprites per tribe, and that x8 x3 is A LOT. The buildings at 2x resolution, saved as png, are already at 213 MiB.

Ok, let's try 2x first.

WizzardMaker · 2020-09-16T22:15:00Z

I've exported and scaled every texture in 20.gfx. With 2x bilinear scaling we get around 43 MiB, plus or minus a few MiB for the extra pixel information when AI upscaling. So with a full lobby we would see around 350 MiB (43*8, assuming all other tribes roughly share the same space requirements) in memory usage just for the units, plus the requirement for the buildings, but that would only be around 860 MiB at max when loading every tribe and every building at once.

Question is, do we just load the whole texture container once when needed, or do we load only the textures on demand (Probably the whole image group containing all the animation textures) which could result in lag, as reading from files is rather slow, even if we cache the textures for a while.

If we go with just loading the whole file, we should probably make the game large address aware, so that we don't run out of x86 memory

nyfrk · 2020-09-17T08:07:27Z

I've exported and scaled every texture in 20.gfx. With 2x bilinear scaling we get around 43 MiB, plus or minus a few MiB for the extra pixel information when AI upscaling. So with a full lobby we would see around 350 MiB (43*8, assuming all other tribes roughly share the same space requirements) in memory usage just for the units, plus the requirement for the buildings, but that would only be around 860 MiB at max when loading every tribe and every building at once.

Is this using RGBA32 BMPs or PNGs? PNGs are compressed and must be decompressed when blitting.

Furthermore we should consider whether to use mip maps. Otherwise we could get ugly effects due to the scaled blitting when zooming out.

Question is, do we just load the whole texture container once when needed, or do we load only the textures on demand (Probably the whole image group containing all the animation textures) which could result in lag, as reading from files is rather slow, even if we cache the textures for a while.

Memory mapping the container would probably be the easiest and fastest solution. At least that lets windows handle all the page swapping for us.

If we go with just loading the whole file, we should probably make the game large address aware, so that we don't run out of x86 memory

True, but we cannot set the laa flag for DLLs. (Well we can, but it has no effect since the flag on the main executable determines whether the entire process is laa). So we would have to set the flag for the S4_Main.exe. I am not sure how well the game operates with negative memory pointers. I think the game sometimes uses the sign bit to check whether a pointer is valid. So this must be carefully tested.
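
To illustrate the failure mode I am worried about (this is not code from the game, and I have not confirmed that the game does this anywhere): with the laa flag set, a 32-bit process can receive addresses at or above 0x80000000, and a check that treats the address as a signed value rejects them.

```cpp
#include <cstdint>

// Hypothetical "sign bit" validity check: treats the 32-bit address as signed.
// Relies on the usual two's complement conversion (guaranteed since C++20).
inline bool LooksValidSigned(std::uint32_t address) {
    return static_cast<std::int32_t>(address) > 0;
}

// A check that stays correct for addresses in the upper 2 GiB.
inline bool LooksValidUnsigned(std::uint32_t address) {
    return address != 0;
}
```

An address such as 0x80001000 passes the unsigned check but fails the signed one, so every place that uses the signed variant would have to be found and tested.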

WizzardMaker · 2020-09-17T08:15:14Z

Is this using RGBA32 BMPs or PNGs? PNGs are compressed and must be decompressed when blitting.

Using pngs. But the compression isn't that high, as they are rather small files to begin with. We could save a few bytes if we save the files as raw color streams, but with run length encoding. RLE is rather fast to decode and can save a bit of space.
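
A sketch of what I have in mind. The run format (a count of up to 255 followed by one 32-bit pixel) is just an assumption for illustration, not an existing file format.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One run: `count` repetitions of the same 32-bit pixel (count is 1..255).
struct Run {
    std::uint8_t count;
    std::uint32_t pixel;
};

// Collapses consecutive equal pixels into runs.
inline std::vector<Run> RleEncode(const std::vector<std::uint32_t>& pixels) {
    std::vector<Run> runs;
    for (std::uint32_t p : pixels) {
        if (!runs.empty() && runs.back().pixel == p && runs.back().count < 255) {
            ++runs.back().count;
        } else {
            runs.push_back(Run{1, p});
        }
    }
    return runs;
}

// Expands the runs back into the raw pixel stream.
inline std::vector<std::uint32_t> RleDecode(const std::vector<Run>& runs) {
    std::vector<std::uint32_t> pixels;
    for (const Run& r : runs) {
        pixels.insert(pixels.end(), static_cast<std::size_t>(r.count), r.pixel);
    }
    return pixels;
}
```

Decoding is a single pass with no tables, and sprites with large transparent areas should collapse into a few runs.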

Furthermore we should consider whether to use mip maps. Otherwise we could get ugly effects due to the scaled blitting when zooming out.

I don't really know if mip maps are that necessary if we use a scaling algorithm like bilinear or trilinear when blitting. If we use a custom surface, we could just use hardware accelerated Direct2D, which gives us these options when blitting.

nyfrk · 2020-09-17T15:22:38Z

Using pngs. But the compression isn't that high, as they are rather small files to begin with. We could save a few bytes if we save the files as raw color streams, but with run length encoding. RLE is rather fast to decode and can save a bit of space.

Let's start simple. I propose that we just concatenate all PNGs into one big file. Then we will use libpng to load and blit them onto the surface and hope that it will be fast enough. Converting them to RGBA32 bitmaps will definitely blow up our memory usage.
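
A sketch of how the images could be located in such a file without any extra index. It only relies on the PNG layout itself (8 byte signature, then chunks of 4 byte big-endian length, 4 byte type, data and 4 byte CRC, ending with IEND). `Span` and `IndexConcatenatedPngs` are made-up names, and a real container would probably rather store an explicit offset table.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Span {
    std::size_t offset;  // start of the PNG inside the container
    std::size_t size;    // total bytes including signature and IEND chunk
};

// Walks the chunk headers of back-to-back PNG files and records where each one lies.
// Stops at the first truncated or non-PNG data.
inline std::vector<Span> IndexConcatenatedPngs(const std::vector<std::uint8_t>& data) {
    static const std::uint8_t kSig[8] = {0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    std::vector<Span> images;
    std::size_t pos = 0;
    while (pos + 8 <= data.size() && std::memcmp(&data[pos], kSig, 8) == 0) {
        const std::size_t start = pos;
        pos += 8;
        bool ended = false;
        while (!ended && pos + 12 <= data.size()) {
            const std::size_t len = (static_cast<std::size_t>(data[pos]) << 24) |
                                    (static_cast<std::size_t>(data[pos + 1]) << 16) |
                                    (static_cast<std::size_t>(data[pos + 2]) << 8) |
                                    static_cast<std::size_t>(data[pos + 3]);
            ended = std::memcmp(&data[pos + 4], "IEND", 4) == 0;
            if (len > data.size() - pos - 12) return images;  // truncated chunk
            pos += 12 + len;  // length + type + data + CRC
        }
        if (!ended) break;
        images.push_back({start, pos - start});
    }
    return images;
}
```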

I don't really know if mip maps are that necessary if we use a scaling algorithm like bilinear or trilinear when blitting. If we use a custom surface, we could just use hardware accelerated Direct2D, which gives us these options when blitting.

I am not sure about that. Hardware accelerated blitting will probably be difficult since we need extra features like "growing buildings" or fog of war fading. I would create a software solution first (similar to the original function).

WizzardMaker · 2020-10-01T17:49:31Z

How is the progress on the renderer @nyfrk?

nyfrk · 2020-10-01T20:25:42Z

I am sorry. I currently do not have much free time. I will continue working on open source projects after the 5th of November.

nyfrk · 2020-10-16T22:30:00Z

I did some work today.

I ditched the idea of using libpng and rather used Gdiplus. Gdiplus comes with rgb565 support by default, does support more formats (png, jpeg, etc) and provides nice blitting features (like polyline clipping for the erecting building animation).

The width and height of an image to render is determined by a table lookup. The game manages a table that is modified whenever the player changes the zoom. It maps widths/heights of a gfx to a width/height to render. I will add the table to the API too. If you don't want to wait for the API, you can use INT32* GfxZoomTable = *(INT32**)(S4_Main + 0x10587A8); for now.

Looks like some graphics are already rendered in high resolution. When fully zoomed in, thin objects like grass are displayed almost 1:1 (high resolution). Whereas the chicken is upscaled (low resolution).

We could use the flag to feed the game higher resolution images. Though that would still limit us to the palette.

One annoyance with hooking the Blt function is that we have no way to determine the fog of war level at the currently rendered object, since the fog is already computed beforehand by passing an adjusted (darkened) palette. We cannot make a quick terrain lookup since we do not know the x and y position of the object on the map (at least we don't know it in the hook procedure). Fortunately the game is very optimized and only knows one palette for any fogged object (despite there being multiple fow levels or different objects/team objects). So we can simply check whether the palette is the fow palette and, if it is, draw the sprite slightly darkened.
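
A sketch of that check and of a cheap darkening step on an rgb565 pixel. How the address of the fow palette is obtained is left open here (`fogPalette` is a placeholder), and halving each channel is just one possible choice of "slightly darkened".

```cpp
#include <cstdint>

// True if the palette passed to Blt is the one fog palette.
// fogPalette is a placeholder for the address of the game's fow palette.
inline bool IsFogged(const std::uint16_t* imagePalette, const std::uint16_t* fogPalette) {
    return imagePalette == fogPalette;
}

// Halves each channel of an RGB565 pixel in one step:
// shift everything right by one, then clear the bit that leaked in
// from the neighbouring channel (bits 15, 10 and 4).
inline std::uint16_t DarkenRgb565(std::uint16_t px) {
    return static_cast<std::uint16_t>((px >> 1) & 0x7BEF);
}
```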

Here is a demo that renders colored boxes behind objects. (Later we would remove them and blit real images). Fogged sprites are rendered with a dark red box.

https://youtu.be/Sw5nKBgsZ54

It looks good so far but there are still some issues we must fix:

The erecting building animation is not implemented yet
The tiny camera in the side panel apparently needs a scanline fix
And of course we need to incorporate a rgb888 surface

WizzardMaker · 2020-10-16T22:38:53Z

That is good news!

I implemented the identification of the gfx files today as well. It's still experimental, so if you need the feature I'd send you the application to apply it to a gfx file.

How does the game handle the building animation? There has to be some kind of info tied to the object, indicating the progress, isn't there?

nyfrk · 2020-10-16T22:41:07Z

The game uses argument _6 for this. _6 is the current y position of the zigzag clipper.

WizzardMaker · 2020-10-16T22:46:46Z

Oh

Well, I don't know how we could create that zig zag procedurally in code, but we could just create a mask texture and draw the building texture with the mask applied. That would fix that

nyfrk · 2020-10-16T22:49:51Z

I think I would first try the region clipping method of Gdiplus.

WizzardMaker · 2020-10-16T22:53:11Z

Yes, absolutely. I was rather thinking about a way to preserve the old way of building, for the future

We could also use different images for steps of the building process, but I think I am getting ahead of myself ^^

nyfrk · 2020-10-16T22:56:00Z

The nice thing about using a region clipper is that we should be able to translate it just using argument _6.

We could also use different images for steps of the building process, but I think I am getting ahead of myself ^^

This is definitely possible. We could calculate the percent of the construction and just change the image whenever we want.

Edit: I tested that and it is not that easy. The "clipping mode" is only used when a part of the zigzag pattern is actually visible. If it is not, we cannot distinguish it from a non-built (zigzag pattern is below bottom edge of the screen) or already built (zigzag pattern is above top edge of the screen) building.

Furthermore the argument _6 is capped such that it is never below the bottom edge of the screen. That means we cannot do the percent calculation when the zigzag pattern is not visible.

We cannot use different images for different states, unless we change how the game calls the Blt function. However, we can still change the zigzag pattern to something different like triangles, sine or any other shape that fits within the 30 pixels height.

The transparent-blue is the clipping region.
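
A sketch of how such a clipping polygon could be generated and translated with _6. The tooth size defaults, the choice to keep the area below the zigzag, and the names `Point` and `ZigzagRegion` are assumptions for illustration; only the 30 pixel height comes from the game.

```cpp
#include <vector>

struct Point {
    int x;
    int y;
};

// Builds a closed polygon: a zigzag top edge that starts at clipY (the value of _6)
// and alternates between clipY and clipY + toothHeight, closed along the bottom edge.
// halfTooth must be greater than 0; the final zigzag vertex sits on the right edge.
inline std::vector<Point> ZigzagRegion(int left, int width, int bottom, int clipY,
                                       int halfTooth = 15, int toothHeight = 30) {
    std::vector<Point> pts;
    bool peak = true;
    for (int x = 0; x < width; x += halfTooth) {
        pts.push_back({left + x, peak ? clipY : clipY + toothHeight});
        peak = !peak;
    }
    pts.push_back({left + width, peak ? clipY : clipY + toothHeight});
    pts.push_back({left + width, bottom});
    pts.push_back({left, bottom});
    return pts;
}
```

With Gdiplus the points could probably be added to a GraphicsPath and set as the clip of the Graphics object before drawing the building. Swapping the zigzag for triangles or a sine only changes how the top edge is generated.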

WizzardMaker · 2020-10-17T07:00:56Z

Well the already built building has a different id altogether, so that wouldn't really matter

So until we could maybe get the actual progress of the construction, the way the object itself handles it, we have to stay with the clipped version

Is there a way to get the corresponding object of what we draw? Or just a unique identifier of the object that's being drawn

nyfrk · 2020-10-17T14:52:33Z

The actual progress is stored in the CBuilding object in the object pool.

We could get a reference to the rendered object by hooking these functions. Or we could just try to get a reference by stack hacking.

There are actually two identifiers that are of interest:

The identifier of the corresponding object in the object pool (the one that stores health, world position, building type etc).
The identifier for the sprite of the object to draw. Objects are animated and have therefore multiple sprites.

Both of them could be somewhere on the stack when the Blt method is called.

WizzardMaker · 2020-10-22T17:20:23Z

The identification of buildings is now in-game using Direct2D:

14.gfx, image Nr. 3

Now we only need a way to either render to the surface, a way to get the HDC of the surface, or a way to render to the window at the right time in the render process.

nyfrk · 2020-10-23T02:22:39Z

Now we only need a way to either render to the surface, a way to get the HDC of the surface, or a way to render to the window at the right time in the render process.

HDC would be nice since it would abstract from the color depth. The problem is that I don't know a good and fast way to create/manage an HDC for that purpose. You will probably have to construct your Gdiplus::Graphics object yourself from the raw arguments. This seems to be fast (at least acceptable), but it does not abstract from the color depth, and constructing it for each object to render is kind of (unnecessarily) slow. I am not sure if it is a good idea to replace the HDC with a pointer to a Gdiplus::Graphics object (is it safe to share Gdiplus::Graphics across libraries?). That would resolve the speed and abstraction problem though.

So far I prepared the following struct for you. It will be the argument the S4BltHook will provide you:

typedef struct S4BltParams {
	VOID* caller;
	WORD* imagePalette;
	BYTE* imageData;
	INT imageWidth;
	INT imageHeight;
	INT destX;
	INT destY;
	INT destClippingOffsetY;
	WORD* subSurface;
	BOOL imageHighRes;
	INT destWidth;
	INT destHeight;
	INT surfaceWidth;
	INT surfaceHeight;
	INT stride;
	DWORD zoomFactor;
	WORD* surface;
	BOOL isFogOfWar;
	WORD settlerId; // works but not in all cases yet
	WORD spriteId; // not implemented yet
	HDC destinationDc; // not implemented yet, unsure if replaced by Gdiplus::Graphics*
} *LPS4BLTPARAMS;

WizzardMaker · 2020-10-23T06:25:13Z

Can't we get the HDC, when the game renders the finished surface?
We can try to emulate the function and draw the original buildings with the palette data. I know how the game creates the texture data. We could just hook the call, draw to our own surface and then draw that to the game's primary surface.

EDIT:

Is there a way to get the length of WORD* imagePalette; BYTE* imageData;? I presume, that imagePalette is probably max 255 long, but does the renderer know how long imageData is, or does it read until it hits the end bytes of an image?

nyfrk · 2020-10-24T11:58:36Z

We would have to create another hook such that we can blit our surface just before the UI is rendered. This could indeed work out. Do you already have some offset to hook for that?

Is there a way to get the length of WORD* imagePalette; BYTE* imageData;? I presume, that imagePalette is probably max 255 long, but does the renderer know how long imageData is, or does it read until it hits the end bytes of an image?

The palette is always 256 colors large. The imageData is accompanied by imageWidth, which is the width of the gfx image, and imageHeight for the height. destWidth/destHeight are the width and height in pixels on the screen (scaled according to the current zoom).
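
A sketch of a palette lookup, assuming the palette entries are rgb565 words (the surfaces are 16 bit, but I have not stated the palette format explicitly) and that a palette index is a single byte. `Rgb888`, `ExpandRgb565` and `LookupColor` are made-up names.

```cpp
#include <cstdint>

struct Rgb888 {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

// Expands one RGB565 word to 8 bits per channel.
// The high bits are replicated into the low bits so that 31 (or 63) maps to 255.
inline Rgb888 ExpandRgb565(std::uint16_t c) {
    const unsigned r5 = (c >> 11) & 0x1F;
    const unsigned g6 = (c >> 5) & 0x3F;
    const unsigned b5 = c & 0x1F;
    return Rgb888{static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2)),
                  static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4)),
                  static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2))};
}

// A byte index can never leave a 256 entry palette, so no bounds check is needed.
inline Rgb888 LookupColor(const std::uint16_t* palette, std::uint8_t index) {
    return ExpandRgb565(palette[index]);
}
```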

WizzardMaker · 2020-10-24T12:09:04Z

Do you already have some offset to hook for that?

No, I couldn't find one yet

nyfrk · 2020-11-29T18:39:08Z

I just pushed my latest efforts to the S4ModApi project. You should be able to render high resolution sprites using Gdiplus. Here is a snippet that I used to render the boxes. You should be able to alter it to render the appropriate sprites.

I will add true color support later. However you could probably also just blit the sprites to the final surface using the Frame Hook for true color sprites.

WizzardMaker added the enhancement New feature or request label Sep 8, 2020

WizzardMaker self-assigned this Sep 8, 2020
