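Coming from C and C++, I recently started learning x86-64 assembly to better understand how my programs work.
I know the convention in x64 assembly is to reserve 32 bytes of "shadow space" on the stack before calling a function (by doing: subq $0x20, %rsp).
What I'm not sure about is: is the callee responsible for incrementing %rsp again, or is the caller?
In other words (using printf as an example), is number 1 or number 2 correct (or neither :P)?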
1.
subq $0x20, %rsp
movabsq $msg, %rcx
callq printf

2.
subq $0x20, %rsp
movabsq $msg, %rcx
callq printf
addq $0x20, %rsp
(...where msg is an ASCII string stored in the .data section, which I'm passing to printf)
I'm on Windows 10, using GAS as my assembler.
Any help would be much appreciated, cheers.
1 Answer
Deallocating shadow space is the caller's responsibility.
But normally you'd do it once per function, not once per call-site within a function. Usually you just move RSP once (maybe after some pushes) and leave it alone until you're ready to return. That includes making room to store stack args if any for functions with more than 4 args.
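As a minimal sketch of that "once per function" pattern, assuming GAS AT&T syntax and a MinGW-w64 style toolchain that links printf from the C runtime (the contents of msg are a placeholder). Note that it reserves 0x28 bytes rather than 0x20: the extra 8 bytes are my addition to keep RSP 16-byte aligned at each call, because the return address pushed by main's caller leaves RSP 8 bytes off a 16-byte boundary on entry.

```asm
    .data
msg:
    .asciz  "Hello, world!\n"       # placeholder string

    .text
    .globl  main
main:
    subq    $0x28, %rsp             # 32 bytes of shadow space + 8 for alignment, reserved once
    leaq    msg(%rip), %rcx         # 1st integer/pointer arg goes in RCX
    callq   printf                  # printf may use the shadow space, but returns with RSP unchanged
    leaq    msg(%rip), %rcx         # RCX is volatile (call-clobbered), so reload it
    callq   printf                  # second call reuses the same shadow space
    xorl    %eax, %eax              # return 0
    addq    $0x28, %rsp             # the caller (main) releases its own reservation
    retq
```

So of the two options in the question, number 2 has the right idea (the caller does the addq), but the sub/add pair belongs at the top and bottom of the function rather than around each call.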
In the Windows x64 calling convention (and x86-64 System V), the callee must return without changing the caller's RSP, i.e. with ret, not ret 32, and without having copied the return address somewhere else. MS has some examples in https://learn.microsoft.com/en-us/cpp/build/prolog-and-epilog?view=msvc-170#epilog-code
And specifically documents that RSP mustn't be changed by functions:
The x64 ABI considers registers RBX, RBP, RDI, RSI, RSP, R12, R13, R14, R15, and XMM6-XMM15 nonvolatile. They must be saved and restored by a function that uses them.
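To make the callee side concrete, here is a small sketch of a leaf function that uses the caller's shadow space as scratch and returns with a plain ret. The name add_two is hypothetical, not something from the question. On entry the return address sits at 0(%rsp), so the 32 bytes of shadow space are at 8(%rsp) through 39(%rsp).

```asm
    .text
    .globl  add_two                 # hypothetical: int add_two(int a, int b)
add_two:
    movl    %ecx, 8(%rsp)           # spill 1st arg into its shadow-space slot
    movl    %edx, 16(%rsp)          # spill 2nd arg into the next slot
    movl    8(%rsp), %eax           # reload a
    addl    16(%rsp), %eax          # EAX = a + b
    retq                            # plain ret pops only the return address
```

The callee never adjusts RSP here, so after the ret the caller sees exactly the RSP it had before the call, with its shadow space still allocated.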
(You also need to emit unwind metadata for every instruction that moves the stack pointer, and about where you saved non-volatile aka call-preserved registers, if you want to be fully compliant with the ABI, including for SEH and C++ exception unwinding. Toy programs still work fine without, as long as you don't expect C++ exceptions to work, or debuggers to unwind the stack back to the stack frame of a caller.)
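For reference, GAS has .seh_* directives for emitting that unwind metadata when targeting Windows. The sketch below is modeled on the directives MinGW-w64 GCC puts in its own output; print_twice is a hypothetical function that takes a string pointer in RCX and passes it to puts twice. Treat it as an illustration under those assumptions, not a complete guide to the unwind rules.

```asm
    .text
    .globl  print_twice             # hypothetical: void print_twice(const char *s)
    .seh_proc print_twice
print_twice:
    pushq   %rbx                    # save a non-volatile register we want to use
    .seh_pushreg %rbx               # unwind info: RBX was pushed here
    subq    $0x20, %rsp             # shadow space; the push already realigned RSP to 16
    .seh_stackalloc 0x20            # unwind info: 32 bytes were allocated here
    .seh_endprologue
    movq    %rcx, %rbx              # keep the pointer across the first call
    callq   puts                    # RCX still holds s
    movq    %rbx, %rcx              # reload the argument from the preserved copy
    callq   puts                    # same shadow space, no add/sub in between
    addq    $0x20, %rsp             # epilogue: release the allocation,
    popq    %rbx                    # restore RBX,
    retq                            # and return with a plain ret
    .seh_endproc
```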
You can see this if you look at MSVC compiler output, e.g. https://godbolt.org/z/xh38jxWqT , or for AT&T syntax, gcc -O2 -mabi=ms to tell it that all the functions it sees are __attribute__((ms_abi)) by default, but it doesn't override the fact that it's targeting Linux. So with -fPIE to make it use LEA instead of 32-bit absolute addressing for symbol addresses, we also get call printf@plt, not Windows style calls to DLL functions. But the stack management from GCC matches what MSVC -O2 also does.
See also How to remove "noise" from GCC/clang assembly output? for more about looking at compiler output - you can answer most questions about how things normally work by looking at what compilers do in practice. Sometimes things compilers do are just a coincidence, especially with optimization disabled (which is why I constructed an example that couldn't inline the functions, so I could still see the calls with optimization enabled). But here we can rule out your alternate hypothesis.
I also constructed this example to show two calls using the same allocation of shadow space, not pointlessly deallocating / reallocating with add/sub. Even with optimization disabled, compilers don't do that.
Re: putting symbol addresses into registers, see How to load address of function or label into register - RIP-relative LEA is the go-to option. It's position-independent, and works in any executable or library smaller than 2GiB of static code+data. And more efficient than movabs.
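Side by side, the two ways of getting the address of msg into RCX look like this in GAS syntax. The label load_msg and the string contents are placeholders; the byte counts in the comments are the standard encodings of these two instructions.

```asm
    .data
msg:
    .asciz  "Hello, world!\n"       # placeholder string

    .text
load_msg:                           # hypothetical label, just to hold the two instructions
    movabsq $msg, %rcx              # 10 bytes: full 64-bit absolute address as an immediate
    leaq    msg(%rip), %rcx         # 7 bytes: address computed relative to RIP
    retq
```

Besides being shorter, the LEA form contains no absolute address, so it needs no fixing up if the image is loaded at a different base.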