Are _Unwind_SjLj Functions the Top Items in Your Profile?
You've been diligently optimizing your iOS application -- static and stack allocations instead of dynamic, inlining code and removing virtual invocations that turn out not to be needed, algorithmic simplifications ... you know, the usual. You expect to see 'operator new' and 'operator delete' working their way up your profile, but what are '_Unwind_SjLj_Register' and '_Unwind_SjLj_Unregister'?
Functions that have cleanup to be done in the event of an exception being thrown are very common. On iOS and other systems 'setjmp/longjmp' (SjLj) exception handling is used. To support the mechanism the compiler inserts into every exception-savvy function a call to _Unwind_SjLj_Register as part of the function prologue and a matching call to _Unwind_SjLj_Unregister in the epilogue.
_Unwind_SjLj_Register and _Unwind_SjLj_Unregister are part of libunwind, and the version used prior to iOS 5.0 has a nasty performance problem (see lines 80 to 118 of Unwind-sjlj.c).
In pseudo code, Register(theNewContext) does this:
    theNewContext->prev = GetCurrentContext()
    SetCurrentContext(theNewContext)
and Unregister(theCurrentContext) does this:
    SetCurrentContext(theCurrentContext->prev)
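The pseudo code above amounts to pushing and popping a per-thread linked list. Here is a minimal C sketch of that idea. The struct layout and the use of compiler thread-local storage are assumptions made for illustration; this is not libunwind's actual code, which keeps more data in the context and stores the list head under a pthread key.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the real function context. */
struct FunctionContext {
    struct FunctionContext *prev; /* the context registered before this one */
};

/* Assumption: one list head per thread, modelled with __thread (GCC/Clang). */
static __thread struct FunctionContext *sCurrent = NULL;

static struct FunctionContext *GetCurrentContext(void) { return sCurrent; }
static void SetCurrentContext(struct FunctionContext *c) { sCurrent = c; }

/* Prologue: link the new context in front of the current one. */
void Register(struct FunctionContext *theNewContext) {
    theNewContext->prev = GetCurrentContext();
    SetCurrentContext(theNewContext);
}

/* Epilogue: restore whatever was current before this context. */
void Unregister(struct FunctionContext *theCurrentContext) {
    /* Contexts are expected to be removed in reverse order of registration. */
    assert(GetCurrentContext() == theCurrentContext);
    SetCurrentContext(theCurrentContext->prev);
}
```

Nested calls then behave like a stack: each function registers its own context on entry and removes it on exit, leaving the caller's context current again.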
That's perfectly reasonable. Unfortunately GetCurrentContext looks like this:
    pthread_once(SetupKey)
    return pthread_getspecific(sKey)
and SetCurrentContext(theContext):
    pthread_once(SetupKey)
    pthread_setspecific(sKey, theContext)
The calls to pthread_once(SetupKey) ensure there is a single pthread_key_t allocated for libunwind's use. pthread_once is quick, but it's being called three times for every exception-savvy function: twice in Register (once for the get, once for the set) and once more in Unregister.
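To make the cost concrete, here is a small C sketch of the lazy-key pattern with a counter added. The names EnsureKey and OnceCallCount are hypothetical instrumentation, and the struct is a simplified stand-in, so treat this as an illustration of the pattern rather than the libunwind source.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the real function context. */
struct FunctionContext { struct FunctionContext *prev; };

static pthread_once_t sOnce = PTHREAD_ONCE_INIT;
static pthread_key_t sKey;
static unsigned long sOnceCalls = 0; /* illustration only, not thread-safe */

/* Runs exactly once per process, however often pthread_once is called. */
static void SetupKey(void) {
    int rc = pthread_key_create(&sKey, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Hypothetical wrapper so the pthread_once calls can be counted. */
static void EnsureKey(void) {
    ++sOnceCalls;
    pthread_once(&sOnce, SetupKey);
}

static struct FunctionContext *GetCurrentContext(void) {
    EnsureKey();
    return pthread_getspecific(sKey);
}

static void SetCurrentContext(struct FunctionContext *theContext) {
    EnsureKey();
    pthread_setspecific(sKey, theContext);
}

void Register(struct FunctionContext *theNewContext) {
    theNewContext->prev = GetCurrentContext(); /* pthread_once call 1 */
    SetCurrentContext(theNewContext);          /* pthread_once call 2 */
}

void Unregister(struct FunctionContext *theCurrentContext) {
    SetCurrentContext(theCurrentContext->prev); /* pthread_once call 3 */
}

unsigned long OnceCallCount(void) { return sOnceCalls; }
```

One Register/Unregister pair raises the counter by three, even though SetupKey itself only ever runs once.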
An alternative would be to allocate that key as part of the process's static initialization, and since iOS 5 a more extreme version of that strategy is used: key 18 is hardcoded for use by libunwind (see line 86 in pthread_machdep.h).
Unwind_SjLj_Faster.c provides implementations of _Unwind_SjLj_Register and _Unwind_SjLj_Unregister that do not use pthread_once. Instead, we identify the key the system code has allocated and then do only what the pseudo code above does -- get the prior value, put it in the new context's prev and set the new context as the current.
The system-provided _Unwind_SjLj_Register/_Unwind_SjLj_Unregister in iOS 5 are more efficient than these replacements. So if we detect that the allocated key is 18 we assume we're on such a system, and call through to the original code.
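The shape of the replacement can be sketched in C as below. This is not the contents of Unwind_SjLj_Faster.c: the function names are hypothetical, the struct is simplified, and how the system's key is actually discovered is not shown here. The sketch simply assumes the key has already been found and handed to RememberUnwindKey.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the real function context. */
struct FunctionContext { struct FunctionContext *prev; };

/* Key number the article says iOS 5 reserves for libunwind. */
enum { kHardcodedUnwindKey = 18 };

static pthread_key_t sCachedKey;
static int sKeyIsKnown = 0;

/* Assumption: the caller has already identified the system's key somehow. */
void RememberUnwindKey(pthread_key_t key) {
    sCachedKey = key;
    sKeyIsKnown = 1;
}

/* Nonzero means: assume iOS 5 or later and defer to the system routines. */
int ShouldCallThrough(pthread_key_t key) {
    return key == (pthread_key_t)kHardcodedUnwindKey;
}

/* Same list manipulation as before, with no pthread_once on the hot path. */
void FastRegister(struct FunctionContext *theNewContext) {
    assert(sKeyIsKnown);
    theNewContext->prev = pthread_getspecific(sCachedKey);
    pthread_setspecific(sCachedKey, theNewContext);
}

void FastUnregister(struct FunctionContext *theCurrentContext) {
    assert(sKeyIsKnown);
    pthread_setspecific(sCachedKey, theCurrentContext->prev);
}
```

The key is looked up once and cached, so each prologue costs one get and one set, and each epilogue costs one set.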