RAD Studio XE2: Win64 exceptions and stack traces on Win64 (JCL/JVCL 64) and OS X

I have been given permission by Embarcadero to talk about RAD Studio XE2. If you want to learn about everything new in XE2, a good source of information is the RAD Studio XE2 World Tour stop near you; the RAD Studio XE2 World Tour page has more information and a list of the stops. I will attend too, and I guess some of you know where.

Before I start with the actual topics of this post, here are a few questions and answers you are probably interested in.

Q: Will the JCL and JVCL support the Win64 target?
A: YES. We know that the JCL and JVCL are essential to a large part of the Delphi user base, and as in past years we will have a new release shortly after the XE2 release.

Q: Was adding Win64 support to the JCL and JVCL easier than the Unicode transition?
A: YES.

Q: I use the JCL only for its stack tracing feature. Will that work for Win64 too?
A: Yes. Read this blog post!

Exceptions

“Exceptions” is in the title of this post, and I expect that you know what exceptions are and how they are handled in Delphi. On the high level that stays the same with XE2, but the internals of exceptions differ between Win32 and Win64, and the difference is a big and very welcome one. The benefit for you is that, in contrast to Win32, try/finally and try/except blocks carry no speed penalty on Win64 as long as no exception is raised.

If you want to check for yourself that try/except slows down your Win32 code, try the following test case.

program TryExceptPerformanceTest;
 
{$APPTYPE CONSOLE}
 
uses
  Windows, SysUtils;
 
function IncNumber(AValue: Integer): Integer;
begin
  Result := AValue + 1;
end;
 
function IncNumberTryExcept(AValue: Integer): Integer;
begin
  try
    Result := AValue + 1;
  except
    Result := 0;
  end;
end;
 
var
  I, V: Integer;
  TickCount, TickCountTryExcept: Cardinal;
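begin
  // Baseline: time 100 million calls of the plain function.
  TickCount := GetTickCount;
  V := 0;
  for I := 1 to 100000000 do
    V := IncNumber(V);
  TickCount := GetTickCount - TickCount;
 
  // Same loop, but every call enters and leaves a try/except block.
  TickCountTryExcept := GetTickCount;
  V := 0;
  for I := 1 to 100000000 do
    V := IncNumberTryExcept(V);
  TickCountTryExcept := GetTickCount - TickCountTryExcept;
 
  // A ratio above 1.00 means the try/except version is slower.
  WriteLn(Format('%.2f times faster',
    [TickCountTryExcept / TickCount]));
  ReadLn;
end.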

If you wonder why I do not use units like System.Diagnostics or types like NativeInt: I want almost all of you to be able to run this test case with your version of Delphi without any modifications. The output of the test case is how much faster the function without try/except is compared to the version with try/except. The higher the value, the greater the slowdown due to try/except; values around 1.00 mean there is no difference. I see the following output with XE2:

Win32: 1.51 times faster
Win64: 0.98 times faster (sometimes I do also see 1.00)

I am not giving absolute values since they depend on the CPU and I am using an XE2 pre-release version. I use a very first generation Core2. Did you know that the Core2 has macro-op fusion only in 32-bit mode, while the Core i (Nehalem and newer) has macro-op fusion in 64-bit mode too? You do not know what macro-op fusion is? It is a processor feature that decodes certain adjacent instruction pairs, such as a compare followed by a conditional jump, into a single micro-op, which can speed up existing code without recompiling it.

Why do these try blocks slow things down on Win32?

On Win32 the compiler generates additional code for each try block, and part of that code is executed in any case, which causes the slowdown. For Win64 the compiler generates no additional code for the try block body. It generates metadata instead, which is stored in the PDATA section of the executable. The main structure for that data is declared in Winapi.Windows.pas:

  PImageRuntimeFunctionEntry = ^TImageRuntimeFunctionEntry;
  IMAGE_RUNTIME_FUNCTION_ENTRY = record
    BeginAddress: DWORD;
    EndAddress: DWORD;
    UnwindInfoAddress: DWORD;
  end;

More information about the structure of the unwind data, and about what happens and which procedure is executed when an exception occurs, can be found on MSDN. The main page for this is Exception Handling (x64).

Are you wondering why the addresses are DWORD (32-bit) and not QWORD?

The size of a PE32+ executable is limited to 2 gigabytes, so there is no need for 64-bit addresses. The addresses in the record are 32-bit relative addresses, i.e. offsets from the image base.
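To make the role of these records a bit more concrete, here is a minimal sketch of how an entry can be found for a given code address. It assumes that the table is sorted by BeginAddress and that EndAddress is exclusive. The type name TRuntimeFunctionEntry, the function FindEntry and all addresses in the table are made up for this illustration; the real lookup is done by the operating system, not by code like this.

```pascal
program PDataLookupSketch;

{$APPTYPE CONSOLE}

type
  // Same field layout as the record quoted above.
  // The type name is hypothetical and only used in this sketch.
  TRuntimeFunctionEntry = record
    BeginAddress: Cardinal;      // relative address of the function start
    EndAddress: Cardinal;        // relative address just past the function end
    UnwindInfoAddress: Cardinal; // relative address of the unwind data
  end;

// Binary search over a table that is sorted by BeginAddress.
// Returns the index of the entry that covers ARva, or -1 if there is none.
function FindEntry(const ATable: array of TRuntimeFunctionEntry;
  ARva: Cardinal): Integer;
var
  First, Last, Middle: Integer;
begin
  Result := -1;
  First := 0;
  Last := High(ATable);
  while First <= Last do
  begin
    Middle := (First + Last) div 2;
    if ARva < ATable[Middle].BeginAddress then
      Last := Middle - 1
    else if ARva >= ATable[Middle].EndAddress then
      First := Middle + 1
    else
    begin
      Result := Middle;
      Exit;
    end;
  end;
end;

const
  // Made-up relative addresses, for illustration only.
  Table: array[0..2] of TRuntimeFunctionEntry = (
    (BeginAddress: $1000; EndAddress: $1040; UnwindInfoAddress: $5000),
    (BeginAddress: $1040; EndAddress: $10A0; UnwindInfoAddress: $500C),
    (BeginAddress: $2000; EndAddress: $2100; UnwindInfoAddress: $5018));

begin
  WriteLn(FindEntry(Table, $1050)); // address inside the second function
  WriteLn(FindEntry(Table, $1800)); // address in the gap between functions
end.
```

An address that is covered by no entry belongs to no function with unwind data; how the system treats that case is described in the MSDN documentation mentioned above.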

Stack trace

I use stack tracing mostly for unhandled exceptions, as it helps a lot in tracking down the underlying issues.

The two main parts in the JCL for getting the stack trace of an exception are the hook for “RaiseException” in kernel32.dll and the gathering of the stack trace. For Win64 the hook is very much the same: we replace the RaiseException import. For the stack tracing we have used two different methods so far. One method is raw stack tracing (we walk down the stack completely ourselves) and the other method is walking down the stack using stack frames.

On Win64 raw stack tracing works, but the result contains too much “noise” (methods are listed several times), and it needs a few changes before it delivers reliable results. I am not an assembler guy, so I looked for alternative solutions. As a temporary solution I currently use CaptureStackBackTrace. It has existed since Windows XP (version 5.1) and is much easier to call than StackWalk64, which also seems to have existed since Windows XP (5.1). Since the Win64 (x64) platform starts with Windows XP 64 (5.2) and Windows Server 2003 (5.2), using these functions is no problem. During my research I came across a blog comment by Barry Kelly saying that there could be problems with StackWalk64. Even though CaptureStackBackTrace is a different function, this means its results should be monitored.
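For readers who have not called CaptureStackBackTrace before, a minimal sketch of a call from Delphi could look like the following. This is not the JCL code. It assumes that kernel32.dll exports the function under the name RtlCaptureStackBackTrace (CaptureStackBackTrace is the name used in the Windows headers), and the declaration, the constant MaxFrames, the type TFrameArray and the helper CaptureFrames are written just for this example.

```pascal
program CaptureBackTraceSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Kept small; older Windows versions limit the number of frames per call.
  MaxFrames = 32;

type
  TFrameArray = array[0..MaxFrames - 1] of Pointer;

// Assumed import: CaptureStackBackTrace, exported as RtlCaptureStackBackTrace.
function RtlCaptureStackBackTrace(FramesToSkip, FramesToCapture: Cardinal;
  BackTrace: Pointer; BackTraceHash: PCardinal): Word; stdcall;
  external 'kernel32.dll' name 'RtlCaptureStackBackTrace';

// Fills AFrames with return addresses and returns how many were captured.
function CaptureFrames(var AFrames: TFrameArray): Integer;
begin
  // Skip 0 frames, capture at most MaxFrames, no hash value wanted.
  Result := RtlCaptureStackBackTrace(0, MaxFrames, @AFrames[0], nil);
end;

procedure DumpStack;
var
  Frames: TFrameArray;
  Count, I: Integer;
begin
  Count := CaptureFrames(Frames);
  for I := 0 to Count - 1 do
    WriteLn(Format('%2d: %p', [I, Frames[I]]));
end;

begin
  DumpStack;
end.
```

The output is a list of raw code addresses. Turning them into unit, procedure and line information still needs debug information such as a map file.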

I checked whether the stack trace for the exception raised by the following snippet can be gathered correctly.

procedure TFoo.Test;
begin
  raise Exception.Create('This is a test exception.');
end;
 
procedure RttiInvokeTestCall;
var
  Context: TRttiContext;
  InstanceType: TRttiInstanceType;
  Foo: TValue;
begin
  Context := TRttiContext.Create;
  try
    InstanceType := Context.FindType('RttiInvokeTest.TFoo') as TRttiInstanceType;
    Foo := InstanceType.GetMethod('Create').Invoke(InstanceType.MetaclassType, []);
    InstanceType.GetMethod('Test').Invoke(Foo, []);
  finally
    Context.Free;
  end;
end;

It can, as you can see in the JCL Except Dialog in the following screenshot.

Are you wondering why the forms look so dark? That is because I enabled VCL Styles for the application. VCL Styles are a new feature in XE2 that lets you make your existing application look more modern. Several styles are delivered with XE2 and you can also create your own. I have chosen a dark style (this one is called “Ruby Graphite”) so that you see the difference right away.

The JCL does not support the OS X target at the moment, but I have checked how one could get a stack trace on OS X. This is possible with the function backtrace in libdl.dylib. The definition looks like this:

function backtrace(PointerArray: PPointerArray; 
  ASize: Integer): Integer; cdecl; 
  external 'libdl.dylib' name '_backtrace';
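A call could then look roughly like the following console sketch for the OS X target. It is not the code from the proof of concept described in this post. For brevity it declares the first parameter as an untyped Pointer instead of PPointerArray, and the names MaxFrames, TFrameArray and CaptureFrames are made up for this example.

```pascal
program BacktraceSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  MaxFrames = 64;

type
  TFrameArray = array[0..MaxFrames - 1] of Pointer;

// Same import as above, with the buffer passed as an untyped pointer.
function backtrace(Buffer: Pointer; ASize: Integer): Integer; cdecl;
  external 'libdl.dylib' name '_backtrace';

// Fills AFrames with return addresses and returns how many were stored.
function CaptureFrames(var AFrames: TFrameArray): Integer;
begin
  Result := backtrace(@AFrames[0], MaxFrames);
end;

var
  Frames: TFrameArray;
  Count, I: Integer;
begin
  Count := CaptureFrames(Frames);
  // Raw addresses only; resolving them to names needs e.g. a map file.
  for I := 0 to Count - 1 do
    WriteLn(Format('%2d: %p', [I, Frames[I]]));
end.
```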

How one can hook exceptions is a completely different question.

As a small proof of concept for backtrace I did this:

  • Extracted the map parsing classes from JclDebug and some helper functions into a new unit to get rid of unused JCL code that might not compile yet
  • Created a new FireMonkey HD application
    (FireMonkey is the new crossplatform framework in XE2)
  • Set target to OS X
  • Added a TPanel, TButton and TMemo to the FireMonkey HD form
  • Added an OnClick handler for the TButton that calls backtrace and resolves the addresses with the map parsing classes
  • Set “Map File” to Detailed in the Delphi Compiler linking options for the OS X target
  • Added the map file in the deployment tab so that it automatically gets copied to the Mac
    (there is no need for copying anything manually to the Mac)
  • Started the Platform Assistant Server on the Mac
  • F9 in the IDE
  • Press the button on the form on the Mac

The result looks like this:

The lines without location info are probably from the OS, but I have not yet tried to resolve them. Maybe backtrace_symbols helps here.

That is all on this topic for today.


6 Responses to RAD Studio XE2: Win64 exceptions and stack traces on Win64 (JCL/JVCL 64) and OS X

  1. Pingback: Delphi XE2: 64bit, 3d GUI, MacOs, iOS en meer! | 4DotNet Developers Blog

  2. Pingback: Paweł Głowacki : Links to resources about Delphi XE2 and RAD Studio XE2

  3. Uwe,
    I could not make it work under x64 with the latest version of JCL:

    1. At first, it was impossible to replace the RaiseException import entry.
    The line

    Inc(ImportEntry32);

    in TJclPeMapImgHooks.ReplaceImport (JclPEImage.pas) was incrementing the pointer by 4 bytes (!!!) under x64.
    I had to replace that line with

    Inc(ULONG_PTR(ImportEntry32), SizeOf(pointer));

    this produced the desired result.

    2. After an exception is raised, the condition

    (TJclAddr(Arguments) = TJclAddr(@Arguments) + SizeOf(Pointer))

    in HookedRaiseException (JclHookExcept.pas) was *not* satisfied.
    What does that condition mean?

    Thank you!

    • Uwe Schuster says:

      Looks like you’re using either the version from the XE2 Partner DVD or JCL release 2.3. Neither contains the required changes. The changes are only in the JCL SVN. I committed them a week ago as revisions 3601 through 3605.

  4. Dmitry Streblechenko says:

    Ah! That works!
    BTW, jcld16win64.inc is missing from the latest revision

    Thank you!
