WTFCrash()

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

WTFCrash()

Michael Catanzaro
Hi,

After [1], WTFCrash() is used only in debug builds on Darwin. For
Darwin release builds, inline assembly is used to trigger a SIGTRAP. I
experimented with this today and found it works quite badly on Linux,
somehow confusing gdb and clobbering the top frames of the stacktrace,
so I think we should leave that unchanged and keep it Darwin-only. So
this mail applies only to debug builds on Darwin, and to non-Darwin
ports.

Now, currently, WTFCrash() does three things:

 (1) Calls a user-configurable crash hook
 (2) Print a low-quality backtrace to stderr
 (3) Crash somehow:
   - If ASAN is used, with __builtin_trap() (that's SIGILL on Linux
x86_64)
   - Then with *(int *)(uintptr_t)0xbbadbeef = 0, which might fail to
crash if 0xbadbeef is a valid address, and is SIGSEGV otherwise
   - Then with __builtin_trap() if COMPILER(GCC_OR_CLANG)
   - Then with ((void(*)())0)() otherwise (presumably SIGSEGV or
SIGBUS, dunno)

This is all rather more complicated than it needs to be.

First off, the crash hook is (almost) unused and should simply be
removed, see [2].

Next, the low-quality backtrace. Does anyone think this is useful? It's
mainly annoying to me, because it's not anywhere near as good as a
proper backtrace that shows stack members, it's mangled so function
names are unnecessarily-difficult to read, and it takes all of five
seconds to get a much nicer one with modern Linux developer tools. If
other developers like it, perhaps we could keep it for debug builds
only, and skip right to the crashing in release builds? I suppose we
could keep printing it always if there is desire to do this, because it
has never caused any problems with Linux crash telemetry and doesn't
seem to be harming anything, but otherwise my instinct is to simplify.

Now, as for how exactly to crash. Current logic, with asan disabled,
prefers SIGSEGV, then SIGILL if that fails, then SIGSEGV again. I don't
like that WTFCrash() triggers a SIGSEGV. This is not very clean; at
least on Linux, it's conventional for assertion failures and other
intentional crashes to cause SIGABRT instead of trying to write to
0xbadbeef. SIGILL is also quite unusual. I  think I'd be happy if we
replace all of WTFCrash() with a single call to abort(). Any objections
to this? Is there a special reason for the current convoluted logic?

Michael

[1] https://bugs.webkit.org/show_bug.cgi?id=153996
[2] https://bugs.webkit.org/show_bug.cgi?id=183369

_______________________________________________
webkit-dev mailing list
[hidden email]
https://lists.webkit.org/mailman/listinfo/webkit-dev
Reply | Threaded
Open this post in threaded view
|

Re: WTFCrash()

Mark Lam


> On Mar 6, 2018, at 9:09 PM, Michael Catanzaro <[hidden email]> wrote:
>
> Hi,
>
> After [1], WTFCrash() is used only in debug builds on Darwin. For Darwin release builds, inline assembly is used to trigger a SIGTRAP. I experimented with this today and found it works quite badly on Linux, somehow confusing gdb and clobbering the top frames of the stacktrace, so I think we should leave that unchanged and keep it Darwin-only. So this mail applies only to debug builds on Darwin, and to non-Darwin ports.
>
> Now, currently, WTFCrash() does three things:
>
> (1) Calls a user-configurable crash hook
> (2) Print a low-quality backtrace to stderr
> (3) Crash somehow:
>  - If ASAN is used, with __builtin_trap() (that's SIGILL on Linux x86_64)
>  - Then with *(int *)(uintptr_t)0xbbadbeef = 0, which might fail to crash if 0xbadbeef is a valid address, and is SIGSEGV otherwise
>  - Then with __builtin_trap() if COMPILER(GCC_OR_CLANG)
>  - Then with ((void(*)())0)() otherwise (presumably SIGSEGV or SIGBUS, dunno)
>
> This is all rather more complicated than it needs to be.
>
> First off, the crash hook is (almost) unused and should simply be removed, see [2].
>
> Next, the low-quality backtrace. Does anyone think this is useful? It's mainly annoying to me, because it's not anywhere near as good as a proper backtrace that shows stack members, it's mangled so function names are unnecessarily-difficult to read, and it takes all of five seconds to get a much nicer one with modern Linux developer tools. If other developers like it, perhaps we could keep it for debug builds only, and skip right to the crashing in release builds? I suppose we could keep printing it always if there is desire to do this, because it has never caused any problems with Linux crash telemetry and doesn't seem to be harming anything, but otherwise my instinct is to simplify.

On Darwin, we currently only use WTFCrash() on Debug builds (see definition of the CRASH() macro).  Feel free to make linux do the same.  FWIW, I use this crash trace a lot when debugging crashes when I don’t already have a debugger attached yet.  Of course, with a debugger attached, it is of less value.

> Now, as for how exactly to crash. Current logic, with asan disabled, prefers SIGSEGV, then SIGILL if that fails, then SIGSEGV again. I don't like that WTFCrash() triggers a SIGSEGV. This is not very clean; at least on Linux, it's conventional for assertion failures and other intentional crashes to cause SIGABRT instead of trying to write to 0xbadbeef. SIGILL is also quite unusual. I  think I'd be happy if we replace all of WTFCrash() with a single call to abort(). Any objections to this? Is there a special reason for the current convoluted logic?

For Darwin, I think we want assertion failures to generate crash logs for post-mortem analysis.  I’m not sure if SIGABRT will give us the crash logs we want.

The choice of a SIGSEGV on 0xbadbeef (for debug builds) and WTFBreakpointTrap() e.g. int3 on x86_64 (for release builds), is so that we can discern the crash log for an assertion failure vs any other crash.

The reason for release builds using WTFBreakpointTrap() is so that we can get a crash with minimal perturbation to the register state, to help with post-mortem crash analysis.  Debug builds are only used during development and internal testing.  So, we take the opportunity to get more crash info there (hence, the dumping of the crash trace).  The reason that ASan builds use __builtin_trap() is because ASan does not like the memory access of 0xbadbbeef (if I remember correctly).

In addition, we also have infrastructure in place that looks for these crash patterns.  So, changing to use SIGABRT (even if it generates a crash log on Darwin) would mean additional cost and churn to update that infrastructure, with not much gain to show for it.  Hence, I object to the change.

Feel free to change the linux implementation of CRASH() to use abort() if that works better for linux folks.  BTW, CRASH() is what you want to define/redirect.  WTFCrash() is only one implementation of CRASH().  No client should be calling WTFCrash() directly.

Mark

> Michael
>
> [1] https://bugs.webkit.org/show_bug.cgi?id=153996
> [2] https://bugs.webkit.org/show_bug.cgi?id=183369
>
> _______________________________________________
> webkit-dev mailing list
> [hidden email]
> https://lists.webkit.org/mailman/listinfo/webkit-dev

_______________________________________________
webkit-dev mailing list
[hidden email]
https://lists.webkit.org/mailman/listinfo/webkit-dev
Reply | Threaded
Open this post in threaded view
|

Re: WTFCrash()

Michael Catanzaro

On Wed, Mar 7, 2018 at 12:36 AM, Mark Lam <[hidden email]> wrote:
> On Darwin, we currently only use WTFCrash() on Debug builds (see
> definition of the CRASH() macro).  Feel free to make linux do the
> same.  FWIW, I use this crash trace a lot when debugging crashes when
> I don’t already have a debugger attached yet.  Of course, with a
> debugger attached, it is of less value.

Unfortunately that somehow breaks the top frames in the backtrace on
Linux, so we can't do that.

> The reason for release builds using WTFBreakpointTrap() is so that we
> can get a crash with minimal perturbation to the register state, to
> help with post-mortem crash analysis.  Debug builds are only used
> during development and internal testing.  So, we take the opportunity
> to get more crash info there (hence, the dumping of the crash trace).
>  The reason that ASan builds use __builtin_trap() is because ASan
> does not like the memory access of 0xbadbbeef (if I remember
> correctly).
>
> In addition, we also have infrastructure in place that looks for
> these crash patterns.  So, changing to use SIGABRT (even if it
> generates a crash log on Darwin) would mean additional cost and churn
> to update that infrastructure, with not much gain to show for it.  
> Hence, I object to the change.

OK, interesting. I won't change the behavior for Darwin, then.

Our crash telemetry does not care about the crash signal that's used;
we'll get a nice backtrace regardless. Linux developers kinda expect
SIGABRT to be used for intentional crashes like WTFCrash(), though, so
it would be (very) slightly more useful for us to use abort().

> Feel free to change the linux implementation of CRASH() to use
> abort() if that works better for linux folks.  BTW, CRASH() is what
> you want to define/redirect.  WTFCrash() is only one implementation
> of CRASH().  No client should be calling WTFCrash() directly.

CRASH() always calls WTFCrash(), except in Darwin release builds. So
one option is to change only WTFCrash(). I experimented with changing
CRASH() to call abort() directly and decided it doesn't really matter
either way. The advantage of changing CRASH() is that there is one
fewer uninteresting frame at the top of the backtrace, since WTFCrash
will not be present. The cost is it will be very slightly harder to
notice that CRASH() was called, since WTFCrash() won't be present in
the backtrace.

Michael

_______________________________________________
webkit-dev mailing list
[hidden email]
https://lists.webkit.org/mailman/listinfo/webkit-dev
Reply | Threaded
Open this post in threaded view
|

Re: WTFCrash()

Michael Catanzaro
On Wed, Mar 7, 2018 at 10:37 AM, Michael Catanzaro
<[hidden email]> wrote:
> Unfortunately that somehow breaks the top frames in the backtrace on
> Linux, so we can't do that.

I mean, we can't do it in exactly the same way that Darwin does. Yes,
of course we can change CRASH() to call abort() or do whatever we want.

_______________________________________________
webkit-dev mailing list
[hidden email]
https://lists.webkit.org/mailman/listinfo/webkit-dev
Reply | Threaded
Open this post in threaded view
|

Re: WTFCrash()

Adrian Perez de Castro
In reply to this post by Michael Catanzaro
Hello,

Just a quick note below about the backtraces...

On Tue, 06 Mar 2018 23:09:18 -0600, Michael Catanzaro <[hidden email]> wrote:
 

> After [1], WTFCrash() is used only in debug builds on Darwin. For
> Darwin release builds, inline assembly is used to trigger a SIGTRAP. I
> experimented with this today and found it works quite badly on Linux,
> somehow confusing gdb and clobbering the top frames of the stacktrace,
> so I think we should leave that unchanged and keep it Darwin-only. So
> this mail applies only to debug builds on Darwin, and to non-Darwin
> ports.
>
> Now, currently, WTFCrash() does three things:
>
>  (1) Calls a user-configurable crash hook
>  (2) Print a low-quality backtrace to stderr
>  (3) Crash somehow:
>    - If ASAN is used, with __builtin_trap() (that's SIGILL on Linux
> x86_64)
>    - Then with *(int *)(uintptr_t)0xbbadbeef = 0, which might fail to
> crash if 0xbadbeef is a valid address, and is SIGSEGV otherwise
>    - Then with __builtin_trap() if COMPILER(GCC_OR_CLANG)
>    - Then with ((void(*)())0)() otherwise (presumably SIGSEGV or
> SIGBUS, dunno)
>
> This is all rather more complicated than it needs to be.
>
> First off, the crash hook is (almost) unused and should simply be
> removed, see [2].
>
> Next, the low-quality backtrace. Does anyone think this is useful? [...]
In some cases I have worked with embedded devices where it was extremely
difficult to even run “gdbserver”, mainly due to memory constraints, and
having the limited backtrace from WTFCrash() can save the day in those
cases.

I think it's perfectly fine to skim over and ignore the backtraces
printed by WTFCrash() if one has a better way of debugging.

> [...] It's mainly annoying to me, [...]

This is not a good reason to remove them. Just don't use them and skip
reading them and you are done ;-)

> [...] because it's not anywhere near as good as a proper backtrace that
> shows stack members, [...]

True, but sometimes it's all we can have (see above), which is always
better than nothing.

> [...] it's mangled so function names are unnecessarily-difficult to read,

Pipe the output to “c++filt” and you will get back demangled names.
Nifty little tool :D

> [...] and it takes all of five seconds to get a much nicer one with
> modern Linux developer tools.

That is, *IF* one has access to modern developers tools.

> [...] If other developers like it, perhaps we could keep it for debug builds
> only, and skip right to the crashing in release builds? I suppose we could
> keep printing it always if there is desire to do this, because it has never
> caused any problems with Linux crash telemetry and doesn't seem to be
> harming anything, but otherwise my instinct is to simplify.

I would enable printing backtraces to -DENABLE_DEVELOPER_MODE=ON instead,
because they can be useful in release (well, or RelWithDebInfo) builds.

Cheers,


--
 Adrián 🎩

_______________________________________________
webkit-dev mailing list
[hidden email]
https://lists.webkit.org/mailman/listinfo/webkit-dev

attachment0 (201 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: WTFCrash()

Michael Catanzaro
On Fri, Mar 9, 2018 at 5:34 AM, Adrian Perez de Castro
<[hidden email]> wrote:
> I would enable printing backtraces to -DENABLE_DEVELOPER_MODE=ON
> instead,
> because they can be useful in release (well, or RelWithDebInfo)
> builds.

OK, let's do this, then. I don't want to make it impossible to solve
bugs on small embedded systems.

I would really hate to work in an environment where you can't get a
proper backtrace... that sounds awful. :(

_______________________________________________
webkit-dev mailing list
[hidden email]
https://lists.webkit.org/mailman/listinfo/webkit-dev
Reply | Threaded
Open this post in threaded view
|

Re: WTFCrash()

Adrian Perez de Castro
On Fri, 09 Mar 2018 12:59:01 -0600, Michael Catanzaro <[hidden email]> wrote:
> On Fri, Mar 9, 2018 at 5:34 AM, Adrian Perez de Castro
> <[hidden email]> wrote:
> > I would enable printing backtraces to -DENABLE_DEVELOPER_MODE=ON
> > instead,
> > because they can be useful in release (well, or RelWithDebInfo)
> > builds.
>
> OK, let's do this, then. I don't want to make it impossible to solve
> bugs on small embedded systems.

Great, thanks!

> I would really hate to work in an environment where you can't get a
> proper backtrace... that sounds awful. :(

It is painful, yep ;-)

--
 Adrián 🎩

_______________________________________________
webkit-dev mailing list
[hidden email]
https://lists.webkit.org/mailman/listinfo/webkit-dev

attachment0 (201 bytes) Download Attachment