recover.panic.defer.2021.03.03

印度阿三17 2021-03-09


Defer, Panic, and Recover

In Go, how are recover and panic related?

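Let's start with a basic example: main starts a goroutine, and the goroutine calls panic on purpose. The program is terminated. That raises a question: why does a panic in another goroutine take the main goroutine down too? The reason is that a panic which reaches the top of any goroutine's stack without being recovered terminates the whole program, not just that goroutine.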

func main() {
    go func() {
        panic("call panic")
    }()

    for {}
}

To deal with this, we bring in recover. The code below is deliberately wrong. What happens when it runs? Does recover stop the panic?

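The program is still terminated; recover has no effect, because it is not called inside a deferred function. recover only intercepts a panic when it is used together with defer and in the same goroutine as the panic.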

func main() {
    go func() {

        // added code (wrong: recover is not inside a deferred function)
        if r := recover(); r != nil {
            fmt.Println(r)
        }

        panic("call panic")
    }()

    for {}
}

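The correct version follows. The Go blog post Defer, Panic, and Recover explains this in detail. Note that main here returns right after printing, so the goroutine may not get a chance to run; a real program has to wait for it.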

func main() {
    go func() {
        defer func() {
            if r := recover(); r != nil {
                fmt.Println(r)
            }
        }()

        panic("call panic")
    }()

    fmt.Println("come on")
}
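
One way to make the snippet above deterministic is to have the caller wait for the goroutine. The following is a minimal sketch, not part of the original article: the helper name runAndRecover is hypothetical, and using sync.WaitGroup for the wait is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// runAndRecover is a hypothetical helper: it runs fn in a new goroutine,
// recovers any panic inside that same goroutine, and returns the
// recovered value (nil if fn did not panic).
func runAndRecover(fn func()) (recovered interface{}) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done() // deferred first, so it runs last
		defer func() {
			// recover is called directly by a deferred function,
			// in the goroutine that panicked.
			recovered = recover()
		}()
		fn()
	}()
	wg.Wait() // block until the goroutine has finished
	return recovered
}

func main() {
	r := runAndRecover(func() { panic("call panic") })
	fmt.Println("recovered:", r)
	fmt.Println("come on")
}
```

The deferred recover still lives in the goroutine that panics; the WaitGroup only keeps the caller alive long enough to observe the result.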

How Panic and Recover are connected

While a panic is in progress, the argument passed to panic becomes the return value of recover.

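The example below declares a struct type named inner. When we panic, the argument is an inner value whose Msg field is Thank. Because recover returns an interface value, we then apply a type assertion to the result, asserting directly that it is an inner value.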

A panic we often hit at work is a slice index out of range. For this kind of panic, Go passes a boundsError from the runtime package by default (A boundsError represents an indexing or slicing operation gone wrong.).

type inner struct {
    Msg string
}

func main() {

    defer func() {
        if r := recover(); r != nil {
            fmt.Print(r.(inner))
        }
    }()

    panic(inner{Msg: "Thank"})
}
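
For the out-of-range case mentioned above, the recovered value can be checked against the exported runtime.Error interface, since boundsError itself is unexported. The sketch below is illustrative only: safeIndex is a hypothetical helper, and it uses the comma-ok form of the assertion so that an unexpected value does not cause a second panic.

```go
package main

import (
	"fmt"
	"runtime"
)

// safeIndex is a hypothetical helper that reads s[i] and turns an
// out-of-range panic into an error instead of crashing the program.
func safeIndex(s []int, i int) (v int, err error) {
	defer func() {
		r := recover()
		if r == nil {
			return // no panic happened
		}
		// Comma-ok assertion: a failed plain assertion would itself panic.
		if re, ok := r.(runtime.Error); ok {
			err = re
			return
		}
		panic(r) // not a runtime error: pass it on unchanged
	}()
	return s[i], nil
}

func main() {
	_, err := safeIndex([]int{1, 2, 3}, 5)
	fmt.Println("error:", err)
}
```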

Nested panics

Here the program panics, and a deferred function that runs during the panic triggers another panic. The error output includes the messages of all three panics.

We use no recover at all and look at the panic output. The comments after the code show that all three panics fired and that the output contains the message of each one.

func main() {
    go func() {

        // defer 1
        defer func() {

            // defer 2
            defer func() {
                panic("call panic 3")
            }()

            panic("call panic 2")
        }()

        panic("call panic 1")
    }()

    for{}
}

//output:
//panic: call panic 1
//        panic: call panic 2
//        panic: call panic 3
//
//goroutine 18 [running]:
//main.main.func1.1.1()
//        /Users/fuhui/Desktop/panic/main.go:10 +0x39

Next we add recover to the code and watch the output. In the example above, the program triggered panics 1, 2, and 3 in turn. Now we change the code to catch panic 3. Does the program still panic?

We nest a third defer in the code to catch panic 3. The output shows that the program still panics.

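We have not looked at the implementation yet, but one thing is already clear: every panic in a Go program has to be handled by a recover, otherwise the program terminates. Handling only the last panic in the chain still ends in abnormal termination.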

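What if we adjust the code and write three recover calls inside defer 3? That does not work either: the first call returns the argument of panic 3 and the later calls return nil, so panic 1 and panic 2 stay unhandled. defer, panic, and recover have to work as a unit; you can verify this yourself.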

func main() {
    go func() {

        // defer 1
        defer func() {

            // defer 2
            defer func() {

                // defer 3
                defer func() {
                    if r := recover(); r != nil{
                        fmt.Println("recover", r)
                    }
                }()

                panic("call panic 3")
            }()

            panic("call panic 2")
        }()

        panic("call panic 1")
    }()

    for{}
}

//output:
//recover call panic 3
//panic: call panic 1
//        panic: call panic 2
//
//goroutine 18 [running]:
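
For contrast, here is a minimal sketch in which each panic gets its own deferred recover, so the function returns normally. It is an illustration rather than the article's code: the function name recoverEachLevel is hypothetical, and it uses only two nesting levels to keep the example short.

```go
package main

import "fmt"

// recoverEachLevel raises "call panic 1" and, while that panic is running
// the deferred calls, "call panic 2". Each panic has its own recover.
func recoverEachLevel() (recovered []interface{}) {
	// Registered first, so it runs last: handles panic 1.
	defer func() {
		if r := recover(); r != nil {
			recovered = append(recovered, r)
		}
	}()

	// defer 1: runs while panic 1 is in progress and raises panic 2.
	defer func() {
		// Handles panic 2, which is raised inside this deferred function.
		defer func() {
			if r := recover(); r != nil {
				recovered = append(recovered, r)
			}
		}()
		panic("call panic 2")
	}()

	panic("call panic 1")
}

func main() {
	// Values appear in the order they were recovered: panic 2, then panic 1.
	fmt.Println(recoverEachLevel())
}
```

Recovering panic 2 only ends panic 2; panic 1 is still in progress when defer 1 returns, which is why the outermost deferred function needs its own recover.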

Source code

Go source version

First, confirm which Go version the source comes from

➜  server go version
go version go1.15.1 darwin/amd64

gopanic

Let's look at the type that represents a panic:

arg is the panic's argument, that is, the value we pass when calling the panic function. A later recover returns this value.

link is a pointer to a _panic, which tells us that within a goroutine the _panic values are stored as a linked list. Several panics can occur in one goroutine, and the information for each of them is kept.

// A _panic holds information about an active panic.
//
// This is marked go:notinheap because _panic values must only ever
// live on the stack.
//
// The argp and link fields are stack pointers, but don't need special
// handling during stack growth: because they are pointer-typed and
// _panic values only live on the stack, regular stack pointer
// adjustment takes care of them.
//
//go:notinheap
type _panic struct {
    argp      unsafe.Pointer // pointer to arguments of deferred call run during panic; cannot move - known to liblink
    arg       interface{}    // argument to panic
    link      *_panic        // link to earlier panic
    pc        uintptr        // where to return to in runtime if this panic is bypassed
    sp        unsafe.Pointer // where to return to in runtime if this panic is bypassed
    recovered bool           // whether this panic is over
    aborted   bool           // the panic was aborted
    goexit    bool
}
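
The linked-list idea can be shown with a toy model. This is only a sketch: toyPanic, push, and args are hypothetical names, and the model keeps just the arg and link fields while ignoring everything else the runtime tracks.

```go
package main

import "fmt"

// toyPanic is a simplified stand-in for runtime._panic (assumption: only
// the arg and link fields are modeled).
type toyPanic struct {
	arg  interface{} // argument to panic
	link *toyPanic   // link to the earlier panic
}

// push inserts a new record at the head of the list, in the same spirit as
// the runtime linking a new _panic to the goroutine's current one.
func push(head *toyPanic, arg interface{}) *toyPanic {
	return &toyPanic{arg: arg, link: head}
}

// args walks the list from the newest panic to the oldest.
func args(head *toyPanic) []interface{} {
	var out []interface{}
	for p := head; p != nil; p = p.link {
		out = append(out, p.arg)
	}
	return out
}

func main() {
	var head *toyPanic
	for _, msg := range []string{"call panic 1", "call panic 2", "call panic 3"} {
		head = push(head, msg)
	}
	fmt.Println(args(head)) // newest first
}
```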

The body of gopanic is fairly long, so we annotate and analyze it directly in comments.

// The implementation of the predeclared function panic.
func gopanic(e interface{}) {
gp := getg()
if gp.m.curg != gp {
print("panic: ")
printany(e)
print("\n")
throw("panic on system stack")
}

if gp.m.mallocing != 0 {
print("panic: ")
printany(e)
print("\n")
throw("panic during malloc")
}
if gp.m.preemptoff != "" {
print("panic: ")
printany(e)
print("\n")
print("preempt off reason: ")
print(gp.m.preemptoff)
print("\n")
throw("panic during preemptoff")
}
if gp.m.locks != 0 {
print("panic: ")
printany(e)
print("\n")
throw("panic holding locks")
}
    
    // Create the panic record and point its link field at the current goroutine's _panic list.
    // Put simply, this is a linked-list operation: the new panic is inserted at the head of the goroutine's panic list.
var p _panic
p.arg = e
p.link = gp._panic
gp._panic = (*_panic)(noescape(unsafe.Pointer(&p)))

atomic.Xadd(&runningPanicDefers, 1)

// By calculating getcallerpc/getcallersp here, we avoid scanning the
// gopanic frame (stack scanning is slow...)
addOneOpenDeferFrame(gp, getcallerpc(), unsafe.Pointer(getcallersp()))

for {
    
    // Loop over gp's deferred calls. We do not expand on it here, but _defer is stored as a linked list just like _panic.
d := gp._defer
if d == nil {
break
}

// If defer was started by earlier panic or Goexit (and, since we're back here, that triggered a new panic),
// take defer off list. An earlier panic will not continue running, but we will make sure below that an
// earlier Goexit does continue running.
if d.started {
if d._panic != nil {
d._panic.aborted = true
}
d._panic = nil
if !d.openDefer {
// For open-coded defers, we need to process the
// defer again, in case there are any other defers
// to call in the frame (not including the defer
// call that caused the panic).
d.fn = nil
gp._defer = d.link
freedefer(d)
continue
}
}

// Mark defer as started, but keep on list, so that traceback
// can find and update the defer's argument frame if stack growth
// or a garbage collection happens before reflectcall starts executing d.fn.
d.started = true

// Record the panic that is running the defer.
// If there is a new panic during the deferred call, that panic
// will find d in the list and will mark d._panic (this panic) aborted.
d._panic = (*_panic)(noescape(unsafe.Pointer(&p)))

done := true
if d.openDefer {
done = runOpenDeferFrame(gp, d)
if done && !d._panic.recovered {
addOneOpenDeferFrame(gp, 0, nil)
}
} else {
p.argp = unsafe.Pointer(getargp(0))
reflectcall(nil, unsafe.Pointer(d.fn), deferArgs(d), uint32(d.siz), uint32(d.siz))
}
p.argp = nil

// reflectcall did not panic. Remove d.
if gp._defer != d {
throw("bad defer entry in panic")
}
d._panic = nil

// trigger shrinkage to test stack copy. See stack_test.go:TestStackPanic
//GC()

pc := d.pc
sp := unsafe.Pointer(d.sp) // must be pointer so it gets adjusted during stack copy
if done {
d.fn = nil
gp._defer = d.link
freedefer(d)
}
if p.recovered {
gp._panic = p.link
if gp._panic != nil && gp._panic.goexit && gp._panic.aborted {
// A normal recover would bypass/abort the Goexit.  Instead,
// we return to the processing loop of the Goexit.
gp.sigcode0 = uintptr(gp._panic.sp)
gp.sigcode1 = uintptr(gp._panic.pc)
mcall(recovery)
throw("bypassed recovery failed") // mcall should not return
}
atomic.Xadd(&runningPanicDefers, -1)

if done {
// Remove any remaining non-started, open-coded
// defer entries after a recover, since the
// corresponding defers will be executed normally
// (inline). Any such entry will become stale once
// we run the corresponding defers inline and exit
// the associated stack frame.
d := gp._defer
var prev *_defer
for d != nil {
if d.openDefer {
if d.started {
// This defer is started but we
// are in the middle of a
// defer-panic-recover inside of
// it, so don't remove it or any
// further defer entries
break
}
if prev == nil {
gp._defer = d.link
} else {
prev.link = d.link
}
newd := d.link
freedefer(d)
d = newd
} else {
prev = d
d = d.link
}
}
}

gp._panic = p.link
// Aborted panics are marked but remain on the g.panic list.
// Remove them from the list.
for gp._panic != nil && gp._panic.aborted {
gp._panic = gp._panic.link
}
if gp._panic == nil { // must be done with signal
gp.sig = 0
}
// Pass information about recovering frame to recovery.
gp.sigcode0 = uintptr(sp)
gp.sigcode1 = pc
mcall(recovery)
throw("recovery failed") // mcall should not return
}
}

// ran out of deferred calls - old-school panic now
// Because it is unsafe to call arbitrary user code after freezing
// the world, we call preprintpanics to invoke all necessary Error
// and String methods to prepare the panic strings before startpanic.
preprintpanics(gp._panic)

fatalpanic(gp._panic) // should not return
*(*int)(nil) = 0      // not reached
}

gorecover

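In the source, getg() returns the current goroutine, and the next line fetches that goroutine's current panic. Then comes the if check: when the condition holds, the panic's recovered field is set to true, which marks it as handled, and the panic's argument is returned. When the condition does not hold, no panic is recovered and nil is returned.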

// The implementation of the predeclared function recover.
// Cannot split the stack because it needs to reliably
// find the stack segment of its caller.
//
// TODO(rsc): Once we commit to CopyStackAlways,
// this doesn't need to be nosplit.
//go:nosplit
func gorecover(argp uintptr) interface{} {
    // Must be in a function running as part of a deferred call during the panic.
    // Must be called from the topmost function of the call
    // (the function used in the defer statement).
    // p.argp is the argument pointer of that topmost deferred function call.
    // Compare against argp reported by caller.
    // If they match, the caller is the one who can recover.
    gp := getg()
    p := gp._panic
    if p != nil && !p.goexit && !p.recovered && argp == uintptr(p.argp) {
        p.recovered = true
        return p.arg
    }
    return nil
}
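
The !p.recovered check can be observed from user code: once a panic has been marked as recovered, a second recover call in the same deferred function gets nil. A minimal sketch follows; the function name recoverTwice is hypothetical.

```go
package main

import "fmt"

// recoverTwice calls recover twice in the same deferred function.
func recoverTwice() (first, second interface{}) {
	defer func() {
		first = recover()  // marks the panic as recovered and returns its argument
		second = recover() // the panic is already recovered, so this returns nil
	}()
	panic("call panic")
}

func main() {
	first, second := recoverTwice()
	fmt.Println(first, second)
}
```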

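Note: recover catches a panic that belongs to its grandparent-level frame. Exactly one frame, the deferred function itself, must sit between the recover call and the panicking frame for recover to catch the panic. In other words, recover has to be called directly by the deferred function.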
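
The frame rule can be checked with a small experiment. This is a sketch with a hypothetical function name, directVsNested: a recover call wrapped in an extra closure is one frame too deep and returns nil, while a direct call from the deferred function returns the panic's argument.

```go
package main

import "fmt"

// directVsNested compares a recover call made through a helper closure
// with one made directly by the deferred function.
func directVsNested() (direct, nested interface{}) {
	defer func() {
		// One frame too deep: recover is called by the helper closure,
		// not by the deferred function itself, so it returns nil.
		nested = func() interface{} { return recover() }()

		// Called directly by the deferred function: this one takes effect.
		direct = recover()
	}()
	panic("call panic")
}

func main() {
	direct, nested := directVsNested()
	fmt.Println("direct:", direct)
	fmt.Println("nested:", nested)
}
```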

Source: https://www./content-4-883851.html