
Demystifying Kernel Threads in the Linux Kernel

 Liucw2012 2012-03-23

*****************************Abstract***********************************************

kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);

       regs.ARM_pc = (unsigned long)kernel_thread_helper;

       do_fork(flags|CLONE_VM|CLONE_UNTRACED, 0, &regs, 0, NULL, NULL);

kernel_thread_helper(void);

       mov pc, r2 transfers control to the kthreadd kernel thread function.

       kthreadd(void *unused)

              while (!list_empty(&kthread_create_list))

                     create_kthread(create);

                            kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);

threadfn = create->threadfn;
data = create->data;

__set_current_state(TASK_UNINTERRUPTIBLE);
              complete(&create->started);
              schedule();

                                   threadfn(data);

kthreadd works on a global list named kthread_create_list. If the list is empty, kthreadd sleeps until something wakes it up.
Now let's see what updates kthread_create_list, that is, what inserts a node into the list. kthread_create() inserts a node named create into the list and then calls wake_up_process(kthreadd_task) to wake the process whose kernel thread function is kthreadd. kthreadd then creates a new process whose initial state is TASK_UNINTERRUPTIBLE, so the new process sleeps until something wakes it up.

********************************************************************************

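Because the implementation of this topic is architecture-specific, this post is based on the ARM architecture.
A kernel thread here is really a process created by the kernel_thread function. It has its own task_struct and can be scheduled by the scheduler; what makes it special is that it runs only in kernel mode.
In the Linux source code, rest_init() in init/main.c already calls kernel_thread to build kernel threads, for example: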
kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
We trace the call kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES) through the source code to reveal what lies behind these special kernel-mode processes.
On ARM, kernel_thread is defined as follows:
/*
 * Create a kernel thread.
 */
pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
{
    struct pt_regs regs;

    memset(&regs, 0, sizeof(regs));
    regs.ARM_r1 = (unsigned long)arg;
    regs.ARM_r2 = (unsigned long)fn;
    regs.ARM_r3 = (unsigned long)do_exit;
    regs.ARM_pc = (unsigned long)kernel_thread_helper;
    regs.ARM_cpsr = SVC_MODE;
    return do_fork(flags|CLONE_VM|CLONE_UNTRACED, 0, &regs, 0, NULL, NULL);
}
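[Note: a small debugging tip. When Dolphin was debugging a BALI board with a BDI3000, the CLONE_UNTRACED flag had to be removed from the do_fork call above and the kernel rebuilt before the debugger would stop at breakpoints set in kernel thread function code.]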

CLONE_FS        parent and child share filesystem information

CLONE_SIGHAND   parent and child share signal handlers

CLONE_VM        parent and child share the address space

CLONE_UNTRACED  prevents a tracing process from forcing CLONE_PTRACE on the child
                /* set if the tracing process can't force CLONE_PTRACE on this clone */

CLONE_PTRACE    /* set if we want to let tracing continue on the child too */
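To make the flag arithmetic concrete, here is a small user-space sketch of the OR expression that kernel_thread passes to do_fork. The SKETCH_ constants are local copies of the values in include/linux/sched.h, and the function names are made up for this illustration; none of this is kernel code.

```c
#include <assert.h>
#include <stdio.h>

/* Local copies of the flag values from include/linux/sched.h.
 * The SKETCH_ prefix is ours, to avoid clashing with <sched.h>. */
#define SKETCH_CLONE_VM       0x00000100UL
#define SKETCH_CLONE_FS       0x00000200UL
#define SKETCH_CLONE_FILES    0x00000400UL
#define SKETCH_CLONE_SIGHAND  0x00000800UL
#define SKETCH_CLONE_PTRACE   0x00002000UL
#define SKETCH_CLONE_UNTRACED 0x00800000UL

/* Mirrors the expression kernel_thread() hands to do_fork():
 * whatever the caller asked for, plus CLONE_VM and CLONE_UNTRACED. */
unsigned long effective_clone_flags(unsigned long caller_flags)
{
    return caller_flags | SKETCH_CLONE_VM | SKETCH_CLONE_UNTRACED;
}

/* Flags that the kthreadd call in rest_init() ends up with. */
unsigned long kthreadd_clone_flags(void)
{
    return effective_clone_flags(SKETCH_CLONE_FS | SKETCH_CLONE_FILES);
}

/* Hypothetical helper: print the combined mask for inspection. */
void print_kthreadd_flags(void)
{
    unsigned long flags = kthreadd_clone_flags();

    /* kernel_thread never requests CLONE_PTRACE itself. */
    assert((flags & SKETCH_CLONE_PTRACE) == 0);
    printf("0x%08lx\n", flags);
}
```

Calling print_kthreadd_flags() shows the single mask that do_fork receives for kthreadd; every kernel thread created this way carries CLONE_VM and CLONE_UNTRACED regardless of what the caller passed.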

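The last line of kernel_thread calls do_fork to build the skeleton of a process (its main structure is task_struct). Inside do_fork, the entry point of the new process is set to ret_from_fork(), so when the scheduler first picks the new process, execution starts in ret_from_fork(). ret_from_fork in turn transfers control to the ARM_pc value set in kernel_thread, that is, to kernel_thread_helper.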

 

kernel_thread_helper
/*
* Shuffle the argument into the correct register before calling the
* thread function. r1 is the thread argument, r2 is the pointer to
* the thread function, and r3 points to the exit function.
*/
extern void kernel_thread_helper(void);
asm( ".section .text\n"
" .align\n"
" .type kernel_thread_helper, #function\n"
"kernel_thread_helper:\n"
" mov r0, r1\n"
" mov lr, r3\n"
" mov pc, r2\n"
" .size kernel_thread_helper, . - kernel_thread_helper\n"
" .previous");

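This assembly first copies r1 into r0. r0 is the register that carries the first argument in a function call, and kernel_thread above set regs.ARM_r1 = (unsigned long)arg. It then copies r3 into lr, which in effect sets the address the kernel thread function returns to. In this example that is the function called after kthreadd returns, namely do_exit. So when a kernel thread function returns, the process it runs in is destroyed, which is why kernel thread functions generally do not return lightly.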

Finally, executing mov pc, r2 transfers control to the kthreadd kernel thread function.
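The three mov instructions can be read as an ordinary C call sequence. The sketch below is a user-space rendering under that reading: struct sketch_regs, sketch_do_exit and the other sketch_ names are hypothetical stand-ins, and unlike the real do_exit, the stand-in exit hook returns so that the example can finish.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the three registers the helper reads. */
struct sketch_regs {
    int  (*r2_fn)(void *);   /* thread function, as in regs.ARM_r2 */
    void  *r1_arg;           /* its argument, as in regs.ARM_r1 */
    void (*r3_exit)(long);   /* runs when fn returns, as in regs.ARM_r3 */
};

static int  exit_calls;      /* how many times the exit hook ran */
static long exit_code;       /* value the exit hook received */

/* Stand-in for do_exit(); the real one never returns. */
static void sketch_do_exit(long code)
{
    exit_calls++;
    exit_code = code;
}

/* Example thread function: returns its argument plus one. */
static int sketch_threadfn(void *arg)
{
    return *(int *)arg + 1;
}

/* C rendering of "mov r0, r1; mov lr, r3; mov pc, r2". */
void sketch_thread_helper(const struct sketch_regs *regs)
{
    int ret = regs->r2_fn(regs->r1_arg);  /* r0 = r1, then jump to r2 */
    regs->r3_exit(ret);                   /* the return lands in lr = r3 */
}

/* Runs the helper once; returns the exit code seen, or -1 on surprise. */
long run_helper_demo(void)
{
    int value = 41;
    struct sketch_regs regs = { sketch_threadfn, &value, sketch_do_exit };

    exit_calls = 0;
    sketch_thread_helper(&regs);
    assert(exit_calls == 1);
    return exit_calls == 1 ? exit_code : -1;
}
```

The point of the sketch is the ordering: the thread function never names its caller, yet returning from it always ends in the exit hook, because the "return address" was planted in advance.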

kthreadd

int kthreadd(void *unused)
{
    struct task_struct *tsk = current;

    /* Setup a clean context for our children to inherit. */
    set_task_comm(tsk, "kthreadd");
    ignore_signals(tsk);
    set_user_nice(tsk, KTHREAD_NICE_LEVEL);
    set_cpus_allowed(tsk, CPU_MASK_ALL);

    current->flags |= PF_NOFREEZE;

    for (;;) {
        set_current_state(TASK_INTERRUPTIBLE);
        if (list_empty(&kthread_create_list))
            schedule();
        __set_current_state(TASK_RUNNING);

        spin_lock(&kthread_create_lock);
        while (!list_empty(&kthread_create_list)) {
            struct kthread_create_info *create;

            create = list_entry(kthread_create_list.next,
                                struct kthread_create_info, list);
            list_del_init(&create->list);
            spin_unlock(&kthread_create_lock);

            create_kthread(create);

            spin_lock(&kthread_create_lock);
        }
        spin_unlock(&kthread_create_lock);
    }

    return 0;
}

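The core of kthreadd is a for loop with a nested while loop. In the for loop, if kthread_create_list is empty, kthreadd calls schedule(); because the process state was set to TASK_INTERRUPTIBLE just before, this call puts the current process to sleep. If kthread_create_list is not empty, execution enters the while loop, which walks kthread_create_list and, for each entry, obtains a pointer create to the enclosing node of type struct kthread_create_info.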

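The function then removes the list entry for create from kthread_create_list and calls create_kthread(create) with the pointer as its argument.
create_kthread() calls kernel_thread to spawn a new process whose kernel thread function is kthread, with create as its argument: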
kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
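The drain loop described above can be modelled in user space. The following is a single-threaded sketch with no locking and no real sleeping; struct sketch_create_info and every sketch_ function are hypothetical simplifications, and setting the spawned field stands in for the call to create_kthread(create).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified counterpart of struct kthread_create_info. */
struct sketch_create_info {
    int (*threadfn)(void *);
    void *data;
    struct sketch_create_info *next;
    int spawned;              /* set once the request has been handled */
};

static struct sketch_create_info *list_head, *list_tail;

/* Stand-in for the enqueue step done by kthread_create(). */
void sketch_enqueue(struct sketch_create_info *create)
{
    create->next = NULL;
    create->spawned = 0;
    if (list_tail)
        list_tail->next = create;
    else
        list_head = create;
    list_tail = create;
}

/* One pass of the outer for loop. Returns the number of requests
 * handled; 0 means the list was empty and kthreadd would sleep. */
int sketch_kthreadd_pass(void)
{
    int handled = 0;

    while (list_head != NULL) {
        struct sketch_create_info *create = list_head;

        list_head = create->next;      /* like list_del_init() */
        if (list_head == NULL)
            list_tail = NULL;
        create->next = NULL;
        create->spawned = 1;           /* like create_kthread(create) */
        handled++;
    }
    return handled;
}

/* Encodes three consecutive passes as a decimal number. */
int run_kthreadd_demo(void)
{
    struct sketch_create_info a = {0}, b = {0};
    int first, second, third;

    list_head = list_tail = NULL;
    first = sketch_kthreadd_pass();    /* empty list: would sleep */
    sketch_enqueue(&a);
    sketch_enqueue(&b);
    second = sketch_kthreadd_pass();   /* drains both requests */
    third = sketch_kthreadd_pass();    /* empty again */
    assert(a.spawned && b.spawned);
    return first * 100 + second * 10 + third;
}
```

In the kernel the list is protected by kthread_create_lock, which is dropped around create_kthread(create); the sketch leaves that out because it runs on a single thread.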

kthread--

static int kthread(void *_create)
{
    struct kthread_create_info *create = _create;
    int (*threadfn)(void *data);
    void *data;
    int ret = -EINTR;

    /* Copy data: it's on kthread's stack */
    threadfn = create->threadfn;
    data = create->data;

    /* OK, tell user we're spawned, wait for stop or wakeup */
    __set_current_state(TASK_UNINTERRUPTIBLE);
    complete(&create->started);
    schedule();

    if (!kthread_should_stop())
        ret = threadfn(data);

    /* It might have exited on its own, w/o kthread_stop. Check. */
    if (kthread_should_stop()) {
        kthread_stop_info.err = ret;
        complete(&kthread_stop_info.done);
    }
    return 0;
}
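kthread sets the state of its own process to TASK_UNINTERRUPTIBLE and then calls schedule(), so the process goes to sleep until another process wakes it up. Once woken, it calls create->threadfn(create->data), provided kthread_should_stop() returns false.

If kthread_should_stop() returns true, some other process has called kthread_stop(p) on the current process p; otherwise kthread_should_stop() returns false.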
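The decision kthread makes after it is woken can be isolated in a few lines. This user-space sketch models only that branch: struct sketch_task, its should_stop field and the sketch_ functions are invented for the illustration, and the sleep itself is not simulated.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the state kthread() looks at after wake-up. */
struct sketch_task {
    int (*threadfn)(void *);
    void *data;
    int should_stop;   /* 1 if a stop was requested before the wake-up */
    int ret;           /* mirrors the local variable ret in kthread() */
    int ran;           /* 1 if threadfn was actually called */
};

/* What kthread() does once schedule() returns, i.e. after a wake-up. */
void sketch_kthread_after_wakeup(struct sketch_task *t)
{
    t->ret = -EINTR;                  /* default when threadfn never runs */
    if (!t->should_stop) {
        t->ret = t->threadfn(t->data);
        t->ran = 1;
    }
}

/* Example thread function: doubles the integer it is given. */
static int sketch_double(void *data)
{
    return *(int *)data * 2;
}

/* Wakes one normal task and one already-stopped task. */
int run_kthread_demo(void)
{
    int input = 21;
    struct sketch_task woken   = { sketch_double, &input, 0, 0, 0 };
    struct sketch_task stopped = { sketch_double, &input, 1, 0, 0 };

    sketch_kthread_after_wakeup(&woken);
    sketch_kthread_after_wakeup(&stopped);
    assert(woken.ran == 1 && stopped.ran == 0);
    assert(stopped.ret == -EINTR);
    return woken.ret;
}
```

The second task shows why ret starts out as -EINTR in the listing: a thread that is stopped before it is ever woken never enters its thread function.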

 

Summary--

kthreadd works on a global list named kthread_create_list. If the list is empty, kthreadd sleeps until something wakes it up.

Now let's see what updates kthread_create_list, that is, what inserts a node into the list. kthread_create() inserts a node named create into the list and then calls wake_up_process(kthreadd_task) to wake the process whose kernel thread function is kthreadd. kthreadd then creates a new process whose initial state is TASK_UNINTERRUPTIBLE, so the new process sleeps until something wakes it up.

The work queue makes use of kthread_create.

That brings us to the last question: who wakes up the process created by kthreadd?
For a work queue, the answer is this: when a driver calls __create_workqueue_key, that function calls start_workqueue_thread to wake up the process created by kthreadd. worker_thread then sleeps on a wait queue until the driver calls queue_work to insert a work node into the queue, at which point queue_work also wakes worker_thread up...

That worker_thread has been woken up does not mean it runs immediately after queue_work is called. queue_work only changes the state of the worker_thread process to TASK_RUNNING; the worker_thread function does not run until the next scheduling point. Because the process has a higher scheduling priority, worker_thread runs soon after that scheduling point arrives.

So, from the work queue point of view, __create_workqueue_key() can be divided into two major parts. The first creates a new process in the system with the help of kthreadd; its kernel thread function is worker_thread, and the new process goes to sleep. The second calls start_workqueue_thread() to wake up the process created in the first part. Once woken, worker_thread executes, but it goes back to sleep because the queue is empty.
When a driver calls queue_work, it inserts a work node into the queue and wakes up worker_thread (puts the worker_thread process into the TASK_RUNNING state). At the next scheduling point, worker_thread runs and handles all the nodes in the queue it was waiting on.
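The life cycle summarized above can be traced with a tiny state machine. This is a user-space sketch under heavy simplification: there is one worker, the scheduler is reduced to an explicit sketch_schedule_point() call, and all sketch_ names are hypothetical rather than kernel APIs.

```c
#include <assert.h>

enum sketch_state { SKETCH_SLEEPING, SKETCH_RUNNABLE };

/* Hypothetical model of one work queue and its worker process. */
struct sketch_wq {
    enum sketch_state state;
    int pending;   /* work nodes waiting in the queue */
    int handled;   /* work nodes the worker has processed */
};

/* Part one of queue creation: the new process starts out asleep. */
void sketch_create_wq(struct sketch_wq *wq)
{
    wq->state = SKETCH_SLEEPING;
    wq->pending = 0;
    wq->handled = 0;
}

/* Part two: the role played by start_workqueue_thread(). */
void sketch_start_wq(struct sketch_wq *wq)
{
    wq->state = SKETCH_RUNNABLE;
}

/* Role of queue_work(): add a node and mark the worker runnable.
 * Note that nothing is processed here. */
void sketch_queue_work(struct sketch_wq *wq)
{
    wq->pending++;
    wq->state = SKETCH_RUNNABLE;
}

/* A scheduling point: a runnable worker drains the queue and then
 * sleeps again. Returns the number of nodes handled in this run. */
int sketch_schedule_point(struct sketch_wq *wq)
{
    int done = 0;

    if (wq->state != SKETCH_RUNNABLE)
        return 0;
    while (wq->pending > 0) {
        wq->pending--;
        wq->handled++;
        done++;
    }
    wq->state = SKETCH_SLEEPING;
    return done;
}

/* Walks through the sequence from the summary; returns total handled. */
int run_workqueue_demo(void)
{
    struct sketch_wq wq;
    int before_start, after_start, before_sched, after_sched;

    sketch_create_wq(&wq);
    before_start = sketch_schedule_point(&wq);  /* asleep: nothing runs */
    sketch_start_wq(&wq);
    after_start = sketch_schedule_point(&wq);   /* runs, finds no work */
    sketch_queue_work(&wq);
    sketch_queue_work(&wq);
    before_sched = wq.handled;                  /* still 0: only woken */
    after_sched = sketch_schedule_point(&wq);   /* drains both nodes */

    assert(before_start == 0 && after_start == 0);
    assert(before_sched == 0 && after_sched == 2);
    assert(wq.state == SKETCH_SLEEPING);
    return wq.handled;
}
```

The check on before_sched captures the point made in the summary: queue_work only makes the worker runnable, and the work is handled at the following scheduling point.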
