修改 Minix 内核的进程调度

the minix book

国外很多大学把这本经典 Minix 书:Operating Systems Design and Implementation, 3/E 作为操作系统课程的教科书或者参考教材。这本书写的非常易懂,除了操作系统的理论基础而且还搭上 Minix 的源代码讲解,是学习操作系统和系统编程的绝好材料。Linus Torvalds 就是看了这本书写出了 Linux,并和这本书的作者、操作系统大牛 Andrew S. Tanenbaum 教授有段关于操作系统设计(微内核还是单内核)的著名争论:Tanenbaum–Torvalds debate.

软件的基础是操作系统,操作系统的核心是内核,内核的核心是进程调度和内存管理,所以 hack 内核的进程调度会是一件有趣的事情。VPSee 最近在温习那本牛书,收获很大,平时看书都看得明白,真正到了动手做的时候又是另外一回事,所以读万卷书、行万里路是有道理的。那本牛书的第107页讲到了 lottery scheduling,VPSee 决定把原来的 Minix 进程调度改成 lottery 算法调度 (lottery scheduling),当作是看书后的练习,有很多大学也把这个练习作为操作系统的课程设计,比如:UCSB 的 Project 1: User Shell and Lottery Scheduling. 关于 lottery 算法这里不做介绍了,网上一大把,这里只谈代码实现,修改代码分成下面几个部分,分别涉及到:proc.h, proc.c, do_fork.c, main.c 这几个文件,这里的代码是在2007年写的,基于Minix 3.1.1,其他版本的 Minix 可能不能运行。

代码

首先在进程数据结构里定义一个 tickets 变量:

struct proc {
  struct stackframe_s p_reg;	/* process' registers saved in stack frame */

  struct proc *p_nextready;	/* pointer to next ready process */
  struct proc *p_caller_q;	/* head of list of procs wishing to send */
  struct proc *p_q_link;		/* link to next proc wishing to send */
  message *p_messbuf; 	/* pointer to passed message buffer */
  proc_nr_t p_getfrom; 		/* from whom does process want to receive? */
  proc_nr_t p_sendto;		/* to whom does process want to send? */

  sigset_t p_pending; 		/* bit map for pending kernel signals */

  char p_name[P_NAME_LEN]; 	/* name of the process, including \0 */

  int p_tickets; 		/* tickets of the process */

#if DEBUG_SCHED_CHECK
  int p_ready, p_found;
#endif
};

在系统进程表里给系统进程初始化 tickets,tickets 大小范围在1到100之间:

PUBLIC void main()
{
...
for (rp = BEG_PROC_ADDR, i = -NR_TASKS; rp < END_PROC_ADDR; ++rp, ++i) {
        rp->p_rts_flags = SLOT_FREE;		/* initialize free slot */
        rp->p_nr = i;				/* proc number from ptr */
        (pproc_addr + NR_TASKS)[i] = rp;	/* proc ptr from number */

        if (i == -4)
                rp->p_tickets = 1;
        else
                rp->p_tickets = 100 + rand()%11;
 }

除了上面的系统进程外,给每个 fork 出来的用户进程初始化 tickets,tickets 大小范围在100到105之间,这样重新调度的机会比系统进程要小一些,系统进程是100到110之间:

PUBLIC int do_fork(m_ptr)
register message *m_ptr;        /* pointer to request message */
{
  /* Parent and child have to share the quantum that the forked process had,
   * so that queued processes do not have to wait longer because of the fork.
   * If the time left is odd, the child gets an extra tick.
   */
  rpc->p_ticks_left = (rpc->p_ticks_left + 1) / 2;
  rpp->p_ticks_left =  rpp->p_ticks_left / 2;

  rpc->p_ticket = 1 + rand()% 5;

  /* If the parent is a privileged process, take away the privileges from the
   * child process and inhibit it from running by setting the NO_PRIV flag.
}

接下来就是 lottery 核心算法和进程调度部分,用 lottery 算法选一个进程出来运行:

PRIVATE void pick_lot_proc()
{
  register struct proc *rp;                     /* process to run */
  register struct proc *prev_rp = NIL_PROC;
  int q;                                        /* iterate over queues */
  int sum = 0, lottery = 0;

  for (q = 0; q < NR_SCHED_QUEUES; q++) {
      rp = rdy_head[q];
      while (rp != NIL_PROC) {
          sum += rp->p_ticket;
          rp = rp->p_nextready;
      }
  }
  lottery = 1 + rand() % sum;
  sum = 0;
  for (q = 0; q < NR_SCHED_QUEUES; q++) {
      rp = rdy_head[q];
      while (rp != NIL_PROC) {
          sum += rp->p_ticket;
          if (sum >= lottery)
              goto next;
          prev_rp = rp;
          rp = rp->p_nextready;
      }
  }

  next:
  next_ptr = rp;		/* pick up rp as the next running process */
  if (priv(rp)->s_flags & BILLABLE)
      bill_ptr = rp; 
  return;
}

编译 Minix 内核:

# cd /usr/src/tools
# make hdboot
# make services

重启 Minix,如果正常运行的话,启动速度应该比以前要慢。如果系统启动不了,或者报错、kernel panic 等,可以在启动时候选择 “2 Start small Minix 3″,使用另外一个没有被破坏的 kernel image 启动系统进行 debug.

评论 (29 Comments)

  1. Hi can u please share the exact file names where this modification has to be made..i tried implementing ur code but its giving me kernel panic

  2. The files you need to modify are: proc.h, proc.c, do_fork.c, main.c.

  3. Hi ,
    Thanks for the reply.
    I made the below changes as told by u :
    1. adding p_tickets in proc.h
    2. adding
    if (i == -4)
    rp->p_tickets = 1;
    else
    rp->p_tickets = 100 + rand()%11;
    in main.c ==> can you please explain the logic behind if(i==-4)

    3. adding
    rpc->p_ticket = 1 + rand()% 5;
    in do_fork.c

    4. and replacing pick_proc() code with
    register struct proc *rp; /* process to run */
    register struct proc *prev_rp = NIL_PROC;
    int q; /* iterate over queues */
    int sum = 0, lottery = 0;

    for (q = 0; q p_ticket;
    rp = rp->p_nextready;
    }
    }
    lottery = 1 + rand() % sum;
    sum = 0;
    for (q = 0; q p_ticket;
    if (sum >= lottery)
    goto next;
    prev_rp = rp;
    rp = rp->p_nextready;
    }
    }

    next:
    next_ptr = rp; /* pick up rp as the next running process */
    if (priv(rp)->s_flags & BILLABLE)
    bill_ptr = rp;
    return;

    I SEE U HAVE CREATED pick_lot_proc() function …does it mean u creating a new function if yes i don’t see you calling it anywhere or you just replacing the existing pick_proc() code with this code

    HOWEVER after i made the above mentioned changes i get kernel panic: softnotify non-NULL after receive error.

    I have trying to implement lottery scheduling for quite a long time…but had no success then came across ur code….it will great if u can kindly please let me know the fix for this problem

  4. just to add i replaced the pick_proc code with ur pick_lot_proc code

  5. Yes, you need to call pick_lot_proc() in enqueue() and dequeue(), for speed reason you don’t want the q which is less than 0 use your lottery scheduling, so I did like this:
    if (q > 0) pick_lot_proc();
    else pick_proc();
    That’s why I didn’t replace the existing pick_proc()
    This code works on Minix 3.1.1, It may not work on other versions. please make sure you are running the same version. Good luck:)

  6. Hi Vpsee,

    Thank you so much for your help. However, I still dont understand why to call pick_lot_proc() in dequeue . And if yes what condition to chose for that. I am using 3.1.4 version of minix . I guess that might be problem

  7. I observed that in my case the exception is happening in main.c after i added the above mentioned lines of code. It gives me kernel panic there

  8. Can you post the patch file on line please?

  9. There are 2 pick_proc() calls in proc.c, right (v3.1.1)? one is in enqueue() and another one is in dequeue(), so I just looked through the whole proc.c and simply replaced them with my pick_lot_proc()/pick_proc() combine if there is a pick_proc().

    You mentioned that what’s the logic behind “if (i==-4)” in main.c.
    The reason is speed, you don’t want to assign a high ticket to IDLE process (-4), otherwise your new lottery kernel would be extremely slow., so I gave the IDLE process only 1 ticket manually.

  10. Sorry, I don’t have those patch files. I did this code in 2007 and this post is based on one of my old notes.

  11. Can you tell me a little about how you tested your new lottery scheduling, please? In other word, how do you know it worked?

  12. First off, If my lottery kernel can boot without a panic, that’s a good sign :-). I tested my lottery kernel by feeding different processes with different tickets and to see if there is difference, like:
    1. giving each of processes the same ticket;
    2. giving system processes a higher ticket and user processes a lower ticket;
    3. giving a specific process a very high ticket, etc.

    Of course I have to build different kernel images. If all the processes have the same ticket (chance), the system would be very slow, because some important processes (system processes) have to wait (blocked) until they get a chance to run. So I think by feeding different tickets you can tell if your kernel works or not. But this doesn’t explain how I tested the lottery algorithm itself. That’s the easy part, just write a program to test it before adding to the kernel.

    Hope it helps.

  13. There are 2 pick_proc() calls in proc.c, right (v3.1.1)? one is in enqueue() and another one is in dequeue(), so I just looked through the whole proc.c and simply replaced them with my pick_lot_proc()/pick_proc() combine if there is a pick_proc().

    As you mentioned above, I still don’t understand your logic for when to use pick_lot_proc() and when to use pick_proc(). Are you replacing pick_proc() completely with pick_lot_proc()? or when do you use pick_lot_proc() and when do use pick_proc()?

    By the way, thank you so much for the quick reply.

  14. you can COMPLETELY replace pick_proc() with pick_lot_proc(), the resulting kernel would be slow but it works. (100% lottery kernel implementation)

    for me, i wanted a faster lottery kernel, so i decided to pick_lot_proc() from those USER processes (q>0) and pick_proc() from SYSTEM processes (q<=0). which means only the USER processes do the lottery scheduling and the SYSTEM processes don't. it's kind of trade-off between performance and 100% pure lottery implementation. (90% lottery kernel implementation)

  15. Did I neeed to define system call to use tickect in PROC strcture?

  16. no, you don’t have to. but you can define a system call to use your ticket if you like.

  17. I use Minix 1.3.5 , What is new features in this release so I can’t use u’r code?

  18. I only tested it on Minix 3.1.1, not sure if it would work on the new version. As far as I know, Minix 3.1.5 changed a lot in kernel.

  19. VPSee,我也正在学习Minix3,也想尝试下彩票调度,我的修改基于3.1.6.修改步骤基本和你一样,但由于是3.1.6,所有进行了一些更改:
    (我的修改基于3.1.6)
    我按照下面的步骤进行的:
    1.给proc结构添加了p_tickets;
    2.在main的初始化进程表循环中,初始化了p_tickets,也就是添加了下面的语句:

    if (i == -4)
    rp->p_tickets = 1;
    else
    rp->p_tickets = 100 + rand()%11;

    3.然后在do_fork.c中,添加了这个语句:
    rpc->p_tickets=1 + rand()%5;
    4.然后在proc.c中,修改了pick_proc函数,改为了:

    PRIVATE struct proc* pick_lot_proc()
    {
    register struct proc *rp; /* process to run */
    int q; /* iterate over queues */
    int sum = 0, lottery = 0;

    for (q = 0; q p_ticket;
    rp = rp->p_nextready;
    }
    }
    lottery = 1 + rand() % sum;
    sum = 0;
    for (q = 0; q p_ticket;
    if (sum >= lottery){
    if(priv(rp)->s_flags & BILLABLE)
    bill_ptr=rp;
    return rp;
    }
    rp = rp->p_nextready;
    }
    }
    }

    这就是我主要的更改,make hdboot之后,重启,不能启动,下面是主要的输出:

    Loading Boot image 3.1.6 revision 26.
    kernel pm vfs rs 中间省略 init(5374K)

    *** kernel message:
    Divide error
    is_nested = 1 vec_nr = 0, trap_errno =0x0, eip =0x3a73, cs = 0x30, elags=0x46 trap_esp 0x00006890
    scheduled was: process 6 (ds) , pc = 7:0x100a
    kernel panic:exception in a kernel task 6
    kernel :0x535d 0xc023 0xd0a 0x1b51 0xd17
    cs: RPL 0, ind 6 of GDT -> base 0x00001000 size 0x00013ef0 exec nonacc DPL 0
    ds: RPL 0, ind 3 of GDT -> base 0x00015000 size 0x18feb000 nonexec DPL 0

    操作系统设计与实现的书我还没有看完,但我感觉我的修改应该是正确的啊,但不幸的是,它确实是错误的,能否告诉我哪里出错了。
    一万个谢谢。

  20. 另外,如果能把回复的副本发送到我的邮箱[email protected],感谢你的帮助

  21. 上面的例子只在 Minix 3.1.1 上能编译并运行,据我所知 3.1.5 和以后版本改动很大。

  22. 我也是minix3的学习者,请教各位,我在修改源码中一些文件(如proc.c he callnr.h)时,提示无法写入,各位帮帮忙,有什么办法可以写入

  23. 你的意思是说,你下载了源代码、解开了源代码,但是不能(没有权限)编辑源代码吗?查看一下源代码文件的权限 ls -l,然后试试 chmod +w

  24. 那个问题解决了,是我装系统的时候犯得错,仍然感谢你的帮助。
    你是否试过在这个系统里试着添加一个自己编写的系统调用,我在做的时候,无法将汇编文件加入到函数库中,总是出错。

  25. 请问minix3的上网设置怎么设置?我已经设置了,怎么去浏览网站呢?使用ping命令的时候,有的可以ping通,有的ping不通,怎样才能知道可以上网了 呢?

  26. 装个 x window,然后装浏览器,或者可在字符模式下用 lynx 之类的。

  27. 不好意思,我不太懂你说的,我是用虚拟机装的minix系统。能说的详细点吗?

  28. 你好 我想问一下你上面的q是什么变量

  29. 您好!我知道现在离您写这些代码已经过了差不多一个decade,但是还是想问一下:有没有可能把您在main.c的修改在schedule.c里面实现?什么思路?另外我现在用的是v3.2.1和您当时使用的版本差了很多,请问如果参考您的修改,有什么需要注意的?感谢!

发表评论