Need info about scopes...

CommonTater · December 18, 2012, 06:37:16 PM

Quote from: frankie on December 18, 2012, 05:04:44 PM
Tater if you defined large structures local to scopes inside the same function they are created on the stack.

Well, the structs themselves are in the range of 20 to 32k ... big yes, but the stack should be able to deal with it if the space is either re-used or released at the end of each scope. The problem might be that the overall program is deeply parenthetical, with function nesting getting quite deep in some points... so maybe there's not enough stack to begin with.

Quote
Your problem can be a stack corruption if you write beyond the structure (bad pointer), a stack overflow if data structures are too large, or an optimization problem.
Knowing you I am sure that you already recompiled everything with no optimization

Actually, it's a bit weird... turning off optimizations crashed the function sooner....

Quote
Remains the other two options.
If you are sure that there is no pointer error, try to increase stack allocation using the /STACK switch on the linker.

I'm pretty sure there's no overruns or pointers to non-existent stuff... and I will try the /STACK switch and maybe bump it up to 2mb instead of the default 1...

CommonTater · December 18, 2012, 06:57:11 PM

Quote from: frankie on December 18, 2012, 05:55:50 PM
The stack frame is created *only* for function call.
Local variables are created and destroyed inside the *current* frame.
The EBP register always points to the function call frame creation point; this is consistent with the stack unwinding of the exception handler (slightly modified for 64-bit programs).
No frame is created and never will be to avoid any mess with stack handling.
If any stack extension is required for local variables, scoped or not, it is handled through the alloca() function.

Actually, from what I was reading earlier this morning there's a second reason why a stack frame cannot be created for each scope... apparently the function parameters and internal variables are often referenced as offsets from EBP... [ebp + 8] and such. To create a second stack frame would mess up the location of ebp and cause it to temporarily lose track of the function's (more or less) global data.

I wish I'd known all this 8 years ago... It would have changed the way I write C programs rather significantly... That'll teach me to trust text books! (In Pascal this was never an issue, far as I know)

I'm still trying to get my head around how the stack is handled with respect to scopes...

It appears that with optimizations off, every variable gets its own chunk of stack, which seems to be what caused my overflow problem.

With any optimization turned on, it appears to re-use a "big as the biggest" chunk of stack to manipulate variables... which, if I'm understanding correctly, means that when a variable goes out of scope its space is re-used as a means of reducing overall stack usage.

As you know, there have been times when the optimization bugs have caused me some grief, so I typically compose and test software with optimizations off then I will turn on the most appropriate optimization (usually "maximize speed") and test again...

In this case enabling optimizations actually corrected the problem... It remains for more extensive testing but it seems that I need the stack optimizations on to make this work. Maybe the answer is in this and making a larger stack through the linker...

I wonder if Pelle would consider making the Stack Optimizations available on a separate switch... so they can be tested independently of the other strategies.

jj2007 · December 18, 2012, 07:01:55 PM

Quote from: frankie on December 18, 2012, 05:55:50 PM
The stack frame is created *only* for function call.
...
No frame is created...

You mean "no frame is created for code inside brackets", I suppose?

Quote from: CommonTater on December 18, 2012, 02:21:47 PM
...programs with a large structs, where I've used nested scopes in hope of avoiding stack overflows, believing the variables in the inner scopes were released from the stack at the closing brace.

Thing is the books I learned from both said that variables are released from memory at the end of each scope... now, years later, I'm discovering this isn't always true...

What is "released"? It can be
- GlobalFree
- leave (i.e. destroy the stack frame)
- leave everything "as is" but declare the memory as part of another bit of code.

Again, with Olly that could be clarified in seconds.

CommonTater · December 18, 2012, 07:15:18 PM

Quote from: jj2007 on December 18, 2012, 07:01:55 PM
Quote
Thing is the books I learned from both said that variables are released from memory at the end of each scope... now, years later, I'm discovering this isn't always true...

What is "released"? It can be
- GlobalFree
- leave (i.e. destroy the stack frame)
- leave everything "as is" but declare the memory as part of another bit of code.

In this case "released" pretty much means the destruction of the stack frame at the end of the function, which is really just resetting EBP and the stack pointer back to where they were when the function was called. (i.e. To point at the previous stack frame)

Quote
Again, with Olly that could be clarified in seconds.

Ok... so write some C code and experiment with it in Ollydbg ... which is exactly what I've been doing throughout this whole fiasco...

jj2007 · December 18, 2012, 07:35:19 PM

Quote from: CommonTater on December 18, 2012, 07:15:18 PM
Ok... so write some C code and experiment with it in Ollydbg ... which is exactly what I've been doing throughout this whole fiasco...

What I meant is you compile your snippet with your optimised and not-so-optimised compiler settings and post the two executables here. And no, I am not asking that because I am a mean old man or unable to compile a C snippet but rather because your executable fails, and it has to do with your compiler (settings)
;-)

frankie · December 18, 2012, 07:45:53 PM

Yes Tater, every dynamic allocation is addressed relative to the base stack pointer, which is EBP.
If you take a look at the generated assembler you will see a lot of moves into some register of the value from the location pointed to by EBP plus an offset, like "mov eax, [EBP+0x12]".
That's why you have a prolog where you save the actual stack pointer in EBP, which will be the base of the frame, then the stack pointer is adjusted to create space on the stack. Because on the IAPX architecture the stack grows toward lower addresses, we must *subtract* the space we need from the stack pointer. The subtracted value must leave the stack aligned on its natural size (a DWORD boundary for 32 bits and a QWORD boundary for 64). This way anything that is pushed explicitly (like a push instruction) or implicitly (like the return address of a subroutine call) will lie beyond that point.
The value subtracted from the stack pointer is the space required to store local, or better, dynamic variables. The parameters passed to the function are above (for IAPX) the base pointer EBP, so typically you access parameters with positive offsets from the base pointer and local variables with negative offsets.
The function alloca() can shrink or enlarge the local variables area (between EBP and the actual stack pointer ESP). Normally the compiler calculates the amount required for local variables and hardcodes the value in the prologue; when the requested space cannot be determined statically during compilation, the area is changed dynamically by alloca().
The use of this function is largely discouraged because it is considered dangerous (this is the common opinion), but the real reason is that abuse of the practice generally leads to stack overflow problems.
From what you said the stack overflow could be the culprit; anyway I suggest enlarging the stack to at least 4M. Consider that the stack can grow by itself: the OS should dynamically increase it on request, but because an access has to cause a memory exception to trigger the virtual memory manager to allocate more memory, maybe with very large structures your access lands too far away, leading to a memory violation before the reallocation happens. This means that the standard 4k (1 page) of reallocation could be insufficient (it is the second parameter of the /STACK switch), so maybe you want to allocate 4 or even 8 pages at a time.
This depends on your code. Consider that enlarging the stack will erode memory resources that remain allocated to your program, while enlarging the allocation size could satisfy the request, determining almost automatically the memory you need.
Last, consider that the stack memory on program entry is allocated, but not committed. This means that the memory manager takes note of how much memory you require, but the real allocation, the commitment, is done when you *access* the memory (or try to).

JJ, yes, the creation of a stack frame is *only* for functions. Scoping determines the visibility of a variable, but the memory allocation is the same along the whole function. With optimizations the compiler could reuse memory, but it is not limited to that: it can do whatever it considers useful to reduce resources (CPU, memory, time).

EDIT: I corrected some incongruences, and attached an image to better explain the mechanism.
For 64 bits the thing is a little more complex due to the calling convention (__fastcall) used, but the basic working of stack handling remains the same.

jj2007 · December 18, 2012, 08:10:43 PM

Here is Olly's output for Pelles C with standard Win32 console settings:
main ┌$ 55 push ebp
00401001 │. 89E5 mov ebp, esp ; create the stack frame
00401003 │. 81EC 08010000 sub esp, 108 ; reserve 108h bytes, i.e. SIZEOF the bigger structure
00401009 │. 57 push edi
0040100A │. 68 08704000 push offset 00407008 ; ┌Arg1 = ASCII "Hello"
0040100F │. E8 5C000000 call puts ; └SimpleConsole.puts
00401014 │. 59 pop ecx
00401015 │. 8DBD F8FEFFFF lea edi, [local.66] ; get the address of first (bigger) struct
0040101B │. 31C0 xor eax, eax
0040101D │. B9 42000000 mov ecx, 42
00401022 │. F3:AB rep stosd ; and clear it
00401024 │. 8D85 F8FEFFFF lea eax, [local.66]
0040102A │. 50 push eax ; ┌Arg2
0040102B │. 68 04704000 push offset 00407004 ; │Arg1 = ASCII "%p\n"
00401030 │. E8 CB000000 call printf ; └SimpleConsole.printf
00401035 │. 83C4 08 add esp, 8
00401038 │. 8DBD FCFEFFFF lea edi, [local.65] ; get the address of the second struct
0040103E │. 31C0 xor eax, eax
00401040 │. B9 41000000 mov ecx, 41
00401045 │. F3:AB rep stosd ; and clear it
00401047 │. 8D85 FCFEFFFF lea eax, [local.65]
0040104D │. 50 push eax ; ┌Arg2
0040104E │. 68 04704000 push offset 00407004 ; │Arg1 = ASCII "%p\n"
00401053 │. E8 A8000000 call printf ; └SimpleConsole.printf
00401058 │. 83C4 08 add esp, 8
0040105B │. 68 00704000 push offset 00407000 ; ┌Arg1 = ASCII "bye"
00401060 │. E8 0B000000 call puts ; └SimpleConsole.puts
00401065 │. 59 pop ecx
00401066 │. 31C0 xor eax, eax
00401068 │. 5F pop edi
00401069 │. 89EC mov esp, ebp ; release the frame
0040106B │. 5D pop ebp
0040106C └. C3 retn

It's all very straightforward, nothing mysterious. And it shouldn't fail, so your problems are elsewhere.

CommonTater · December 18, 2012, 10:06:02 PM

Quote from: frankie on December 18, 2012, 07:45:53 PM
Yes Tater, every dynamic allocation is addressed relative to the base stack pointer, which is EBP.
If you take a look at the generated assembler you will see a lot of moves into some register of the value from the location pointed to by EBP plus an offset, like "mov eax, [EBP+0x12]".

Yep... read about that, then once alerted I saw it in the dissassembly...

Quote
The value added to the stack pointer is the space required to store local or better dynamic variables. The parameters passed to the function are below the base pointer EBP, so typically you access local variables with positive offsets referred to base pointer and negative offsets for parameters.

Ahhh... ok, good to know...

Quote
The function alloca() can shrink or enlarge the local variables area (between EBP and the actual stack pointer ESP). Normally the compiler calculates the amount required for local variables and hardcodes the value in the prologue; when the requested space cannot be determined statically during compilation, the area is changed dynamically by alloca().
The use of this function is largely discouraged because it is considered dangerous (this is the common opinion), but the real reason is that abuse of the practice generally leads to stack overflow problems.

Which is almost certainly what's going on here... There is one condition in the program I was working on when this happened that can send it off into a recursive folder search to find the file it needs to open. This can really suck up the stack space and it was after that it would crash... I suspect because the huge number of pointer variables caused an overflow condition that didn't manifest itself until the structs were instantiated later.

Quote
From what you said the stack overflow could be the culprit; anyway I suggest enlarging the stack to at least 4M. Consider that the stack can grow by itself: the OS should dynamically increase it on request, but because an access has to cause a memory exception to trigger the virtual memory manager to allocate more memory, maybe with very large structures your access lands too far away, leading to a memory violation before the reallocation happens.

I increased it to 2097152 : 8192 over the course of several tries and it seems OK now. I won't know for certain until I do some further testing but this is promising...

With this new stuff in my head... doing a code review suggested that the folder search would best be relocated to its own function, using heap allocation rather than stack space. (I rarely make functions that are only called from one place.) Fortunately it's the only part of that monster function that only needs two parameters... the start path and the filename.

I will be reorganizing it sometime this evening or tomorrow. I'll let you know how it goes...

CommonTater · December 18, 2012, 10:07:25 PM

Quote from: jj2007 on December 18, 2012, 08:10:43 PM
It's all very straightforward, nothing mysterious. And it shouldn't fail, so your problems are elsewhere.

Yet, strangely enough... using Frankie's advice --increasing the stack-- and a bit of background research seems to have me on the way to fixing the problem.

Yes, it was a stack overflow. (Exception code : 0x000003E9)

jj2007 · December 18, 2012, 10:36:42 PM

Quote from: CommonTater on December 18, 2012, 10:06:02 PM
There is one condition in the program I was working on when this happened that can send it off into a recursive folder search to find the file it needs to open. This can really suck up the stack space and it was after that it would crash...

We discussed that elsewhere some time ago, and the conclusion was that a recursive folder search can go max 125 levels deep - with 1MB of stack, you need around 8k per level to achieve a stack overflow. MAX_PATH is 260 bytes...

CommonTater · December 18, 2012, 10:54:53 PM

Quote from: jj2007 on December 18, 2012, 10:36:42 PM
We discussed that elsewhere some time ago, and the conclusion was that a recursive folder search can go max 125 levels deep - with 1MB of stack, you need around 8k per level to achieve a stack overflow. MAX_PATH is 260 bytes...

If I agree with you, will you stop trying to help me?

jj2007 · December 18, 2012, 11:33:57 PM

Quote from: CommonTater on December 18, 2012, 10:54:53 PM
If I agree with you, will you stop trying to help me?

Tater, you are a funny guy ;-)

I just wanted to return the generosity you showed in the other forum. But your reaction shows that you are very tired of chasing your bug, and that right now is not a good moment for offering help. Good luck! And Merry Christmas!!

CommonTater · December 19, 2012, 12:01:54 AM

Quote from: jj2007 on December 18, 2012, 11:33:57 PM
But your reaction shows that you are very tired of chasing your bug, and that right now is not a good moment for offering help.

I have the program working... so no more bug.

CommonTater · December 19, 2012, 02:58:46 AM

Ok... moved the folder search to its own subroutine, used true recursion and used heap allocations for its structs, made the stack bigger (2mb) and redid a couple of small sections of code...

Ran my monster sub into the deepest nether regions of my system and everything worked on a dozen tries.

So... in future I'm going to have to change my decision making about a couple of things.

I'm going to have to do a code review on a bunch of other programs I've written...
but I'm pretty sure I'll still be alive at the end of it...

Thanks a bunch guys .... your help is deeply appreciated

frankie · December 19, 2012, 09:14:27 AM

Tater I'm happy you solved the problem

Anyway I should correct my previous description, which refers to an architecture with a stack that grows toward higher memory addresses. In reality on IAPX it grows toward lower addresses, so the more space you reserve, the lower the address.
I wrote it in a rush, but anyway the concept is absolutely correct.

Your parameters, with the return address and the original stack pointer value, are on one side of the frame pointer EBP, local variables on the other. Later I will correct it, now I'm very busy.
