Android Process Priority and the LowMemoryKiller Mechanism – 4

1. Overview

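One of Android's design principles is that when an application exits, its process may remain in the system so that the next launch responds faster.
This design creates a problem:
every process has its own memory address space, so as more applications are opened the system uses more and more memory and may eventually run short.
The system therefore needs a mechanism that manages all processes and releases them according to a defined policy. That mechanism is lmk, short for LowMemoryKiller, and the lmkd daemon decides when to kill which process.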

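Android is built on Linux, and Linux has a similar memory management policy of its own: the OOM killer (Out Of Memory Killer).
The OOM killer is triggered mainly when a memory allocation fails, and it kills the process with the highest score. lmk, by contrast, checks repeatedly while the system runs. When the remaining available memory gets low, it starts killing processes, choosing which priority levels to kill according to how little memory is left, rather than waiting for a real OOM. By the time a real OOM happens the system may already be in an abnormal state, so it is better to act early and kill some low-priority processes while memory is merely low, so that later operations can proceed smoothly.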

Continued from the previous part.

2. Framework Layer

ProcessList.java defines three command types. These definitions must match the ones in lmkd.c exactly. Their formats are:

LMK_TARGET <minfree> <minkillprio> ... (up to 6 pairs)
LMK_PROCPRIO <pid> <prio>
LMK_PROCREMOVE <pid>

Function         Command         Method           Triggered from
Update oom_adj   LMK_TARGET      updateOomLevels  AMS.updateConfiguration
Set process adj  LMK_PROCPRIO    setOomAdj        AMS.applyOomAdjLocked
Remove process   LMK_PROCREMOVE  remove           AMS.handleAppDiedLocked / cleanUpApplicationRecordLocked

The earlier article on the adj algorithm in Android process scheduling covered AMS.applyOomAdjLocked. The analysis below follows that call path.

3 Low Memory Killer

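Android's Low Memory Killer is a memory management mechanism derived from the OOM killer in the standard Linux kernel. When the system runs short of memory, it kills unneeded processes to free their memory. Two criteria decide which processes are unneeded: oom_adj and the amount of memory the process occupies. oom_adj represents the priority of a process: the higher the value, the lower the priority and the more likely the process is to be killed. Each oom_adj level can have its own free-memory threshold. The Android kernel checks at intervals whether free memory has dropped below one of these thresholds. If it has, it kills the unneeded process with the largest oom_adj and, when several processes share that value, the one occupying the most memory. This repeats until free memory is back above the threshold.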

The LowMemoryKiller thresholds are stored mainly in two files:

  • /sys/module/lowmemorykiller/parameters/adj
  • /sys/module/lowmemorykiller/parameters/minfree

adj holds the adj levels at which the system currently kills processes, and minfree holds the corresponding memory thresholds.

Settings on a Nexus 6 running Android 7.0 (an OS built from source, so they may differ from the retail device):

shamu:/ # cat /sys/module/lowmemorykiller/parameters/adj
0,100,200,300,900,906
shamu:/ # cat /sys/module/lowmemorykiller/parameters/minfree
18432,23040,27648,32256,36864,46080

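Example: write 1,6 to the node /sys/module/lowmemorykiller/parameters/adj and 1024,8192 to the node /sys/module/lowmemorykiller/parameters/minfree.

Policy: when available memory falls below 8192 pages, processes with oom_score_adj >= 6 are killed; when it falls below 1024 pages, processes with oom_score_adj >= 1 are killed.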
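
The threshold lookup in the example above can be sketched in C. This is a simplified illustration, not the driver's code: the helper name pick_min_score_adj and the NO_KILL_ADJ sentinel are made up here, and the real driver compares both free pages and file-cache pages against each threshold, while this sketch uses a single free-page count.

```c
#include <stddef.h>

/* Sentinel meaning "no threshold crossed, kill nothing" (OOM_SCORE_ADJ_MAX + 1). */
#define NO_KILL_ADJ 1001

/*
 * Hypothetical helper: walk the minfree table from the lowest threshold up
 * and return the adj paired with the first threshold that free_pages is below.
 */
int pick_min_score_adj(const int *adj, const int *minfree,
                       size_t n, int free_pages)
{
    for (size_t i = 0; i < n; i++) {
        if (free_pages < minfree[i])
            return adj[i];
    }
    return NO_KILL_ADJ;
}
```

With adj = {1, 6} and minfree = {1024, 8192}, 5000 free pages yields a minimum adj of 6 and 500 free pages yields 1, which matches the policy described above.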

2.1 AMS.applyOomAdjLocked

private final boolean applyOomAdjLocked(ProcessRecord app, boolean doingAll, long now,
        long nowElapsed) {
    ...
    if (app.curAdj != app.setAdj) {
        // [see Section 2.2]
        ProcessList.setOomAdj(app.pid, app.info.uid, app.curAdj);
        app.setAdj = app.curAdj;
    }
    ...
}

3.1 The lmkd daemon

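The LMK process is the lmkd daemon, which starts when the system boots. Its source is in system/core/lmkd/lmkd.c.

lmkd creates a socket named lmkd, whose node is /dev/socket/lmkd. This socket is used to communicate with the framework layer above.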

service lmkd /system/bin/lmkd
    class core
    critical
    socket lmkd seqpacket 0660 system system
    writepid /dev/cpuset/system-background/tasks 

lmkd receives commands from the framework and performs the corresponding operation:

Command         Function         Method
LMK_PROCPRIO    Set process adj  PL.setOomAdj()
LMK_TARGET      Update oom_adj   PL.updateOomLevels()
LMK_PROCREMOVE  Remove process   PL.remove()

2.2 PL.setOomAdj

public static final void setOomAdj(int pid, int uid, int amt) {
    // UNKNOWN_ADJ (adj == 16): return immediately
    if (amt == UNKNOWN_ADJ)
        return;
    long start = SystemClock.elapsedRealtime();
    ByteBuffer buf = ByteBuffer.allocate(4 * 4);
    buf.putInt(LMK_PROCPRIO);
    buf.putInt(pid);
    buf.putInt(uid);
    buf.putInt(amt);
    // write the 16 bytes to the socket [see Section 2.3]
    writeLmkd(buf);
    long now = SystemClock.elapsedRealtime();
    if ((now-start) > 250) {
        Slog.w("ActivityManager", "SLOW OOM ADJ: " + (now-start) + "ms for pid " + pid
                + " = " + amt);
    }
}

buf is 16 bytes long. LMK_PROCPRIO (the command type), pid (the process ID),
uid (the process UID) and amt (the target adj) are written into it in that order, and the bytes are then sent to lmkd over the socket.
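
For illustration, the same 16-byte packet can be built in C. Java's ByteBuffer writes integers in big-endian order by default, which is why lmkd decodes each field with ntohl. The helper name pack_procprio is hypothetical, and the command value 1 for LMK_PROCPRIO is an assumption based on the order in which the three commands are listed.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

#define LMK_PROCPRIO 1  /* assumed command code (LMK_TARGET=0, LMK_PROCPRIO=1, LMK_PROCREMOVE=2) */

/* Hypothetical helper: fill a 16-byte LMK_PROCPRIO packet in network byte order. */
size_t pack_procprio(unsigned char out[16], int32_t pid, int32_t uid, int32_t adj)
{
    uint32_t words[4] = {
        htonl(LMK_PROCPRIO),
        htonl((uint32_t)pid),
        htonl((uint32_t)uid),
        htonl((uint32_t)adj),
    };

    memcpy(out, words, sizeof(words));
    return sizeof(words);
}
```

The receiver recovers each field by applying ntohl to the corresponding 4-byte word.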

2.3 PL.writeLmkd

private static void writeLmkd(ByteBuffer buf) {
    // retry up to 3 times when opening the socket fails
    for (int i = 0; i < 3; i++) {
        if (sLmkdSocket == null) {
                // open the socket [see Section 2.4]
                if (openLmkdSocket() == false) {
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException ie) {
                    }
                    continue;
                }
        }
        try {
            // write the contents of buf to the lmkd socket
            sLmkdOutputStream.write(buf.array(), 0, buf.position());
            return;
        } catch (IOException ex) {
            try {
                sLmkdSocket.close();
            } catch (IOException ex2) {
            }
            sLmkdSocket = null;
        }
    }
}
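
  • If sLmkdSocket is null and opening it fails, the operation is retried;
  • If writing buf to sLmkdOutputStream fails, sLmkdSocket is closed and the operation is retried.

The operation is attempted at most 3 times. If it still has not succeeded after the third attempt, writeLmkd simply returns, so a write through writeLmkd can fail.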
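
The bounded-retry pattern can be sketched as follows. This is not a port of writeLmkd: the callback type try_write_fn and the demo callback fail_until_zero are hypothetical, and the sketch returns a success flag, whereas writeLmkd returns void and leaves the caller unaware of a failure.

```c
#include <stdbool.h>

/* Hypothetical callback type: returns true when the write succeeded. */
typedef bool (*try_write_fn)(void *ctx);

/* Attempt the write at most max_attempts times; report whether any attempt succeeded. */
bool write_with_retry(try_write_fn try_write, void *ctx, int max_attempts)
{
    for (int i = 0; i < max_attempts; i++) {
        if (try_write(ctx))
            return true;
    }
    return false;
}

/* Demo callback: fails while the counter in ctx is positive, then succeeds. */
bool fail_until_zero(void *ctx)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return false;
    }
    return true;
}
```

With three attempts allowed, a writer that fails twice still succeeds on the third try, while one that fails three times gives up.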

Summary:

use_inkernel_interface will probably give way to a user-space policy over time. For now use_inkernel_interface=1, which means:

  • LMK_TARGET: during AMS.updateConfiguration(), updateOomLevels() is called, and the corresponding values are written to the minfree and adj nodes under /sys/module/lowmemorykiller/parameters;
  • LMK_PROCPRIO: during AMS.applyOomAdjLocked(), setOomAdj() is called, oomadj is written to /proc/<pid>/oom_score_adj, and the call returns;
  • LMK_PROCREMOVE: during AMS.handleAppDiedLocked or AMS.cleanUpApplicationRecordLocked(), remove() is called; it currently does nothing and returns immediately.

2.4 PL.openLmkdSocket

private static boolean openLmkdSocket() {
    try {
        sLmkdSocket = new LocalSocket(LocalSocket.SOCKET_SEQPACKET);
        // connect to the lmkd daemon's socket
        sLmkdSocket.connect(
            new LocalSocketAddress("lmkd",
                    LocalSocketAddress.Namespace.RESERVED));
        sLmkdOutputStream = sLmkdSocket.getOutputStream();
    } catch (IOException ex) {
        Slog.w(TAG, "lowmemorykiller daemon socket open failed");
        sLmkdSocket = null;
        return false;
    }
    return true;
}

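sLmkdSocket uses SOCK_SEQPACKET. This socket type provides a sequenced, reliable, two-way, connection-based endpoint. It is very similar to SOCK_STREAM; the only difference is that SOCK_SEQPACKET preserves message boundaries, whereas SOCK_STREAM is a byte stream and records no boundaries.

For example, suppose the local side sends two messages to the peer with write(), one of 4 bytes and then one of 8 bytes. With SOCK_SEQPACKET, read() on the peer reveals that there are two messages and how large each is. With SOCK_STREAM, a single read() can return all 12 bytes at once, with no indication of where one message ends and the next begins.

Another common socket type is SOCK_DGRAM, which provides datagrams and is used for unreliable communication such as UDP.

Back to openLmkdSocket(): it opens a socket named lmkd of type LocalSocket.SOCKET_SEQPACKET, which is only a wrapper; the real type is SOCK_SEQPACKET. It first connects to the lmkd daemon and then writes data into the socket with write(). The next step is on the lmkd side.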
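
The boundary behaviour can be observed with a local socket pair. The following is a small sketch for Linux; the function name first_read_size is made up for this example. It sends a 4-byte and then an 8-byte message and reports how many bytes the first read() returns.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send a 4-byte and an 8-byte message over a local socket pair of the given
 * type, then return the number of bytes delivered by the first read().
 * Returns -1 on any error.
 */
ssize_t first_read_size(int sock_type)
{
    int fds[2];
    char four[4] = {0}, eight[8] = {0}, in[64];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, sock_type, 0, fds) == -1)
        return -1;
    if (write(fds[0], four, sizeof(four)) == (ssize_t)sizeof(four) &&
        write(fds[0], eight, sizeof(eight)) == (ssize_t)sizeof(eight))
        n = read(fds[1], in, sizeof(in));
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

first_read_size(SOCK_SEQPACKET) returns 4, because each read() delivers exactly one message. With SOCK_STREAM the same call typically returns 12, since both writes have been merged into one byte stream by the time the reader runs.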

3.2 LowMemoryKiller Kernel driver

The lowmemorykiller driver is located at
drivers/staging/android/lowmemorykiller.c

3. lmkd

lmkd is a daemon started by the init process when it parses init.rc. lmkd creates a socket named lmkd, whose node is /dev/socket/lmkd; the socket is used to communicate with the framework layer above.

service lmkd /system/bin/lmkd
    class core
    critical
    socket lmkd seqpacket 0660 system system
    writepid /dev/cpuset/system-background/tasks

After lmkd starts, everything that follows happens in platform/system/core/lmkd/lmkd.c, beginning with main().

lowmemorykiller

static struct shrinker lowmem_shrinker = {
    .shrink = lowmem_shrink,
    .seeks = DEFAULT_SEEKS * 16
};

static int __init lowmem_init(void)
{
    register_shrinker(&lowmem_shrinker);
    vmpressure_notifier_register(&lmk_vmpr_nb);
    return 0;
}

static void __exit lowmem_exit(void)
{
    unregister_shrinker(&lowmem_shrinker);
}

register_shrinker and unregister_shrinker are used for initialization and exit, respectively.

3.1 main

int main(int argc __unused, char **argv __unused) {
    struct sched_param param = {
            .sched_priority = 1,
    };
    mlockall(MCL_FUTURE);
    sched_setscheduler(0, SCHED_FIFO, &param);
    // initialize [see Section 3.2]
    if (!init())
        mainloop(); // on success, enter the loop [see Section 3.3]
    ALOGI("exiting");
    return 0;
}

shrinker

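The LMK driver is implemented by registering a shrinker. The shrinker is the Linux
kernel's standard mechanism for reclaiming memory pages, and it is driven by the kernel thread kswapd.

When memory runs short, kswapd walks a list of shrinkers and calls back each registered shrinker function to reclaim pages; kswapd also wakes up periodically to do memory work. Each zone maintains an active_list and an inactive_list, the kernel moves pages between the two lists according to how active they are, and pages are finally reclaimed through shrink_slab and shrink_zone.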

3.2 init

static int init(void) {
    struct epoll_event epev;
    int i;
    int ret;
    page_k = sysconf(_SC_PAGESIZE);
    if (page_k == -1)
        page_k = PAGE_SIZE;
    page_k /= 1024;
    // create the epoll file descriptor
    epollfd = epoll_create(MAX_EPOLL_EVENTS);

    // get the lmkd control socket descriptor
    ctrl_lfd = android_get_control_socket("lmkd");
    // listen on the lmkd socket
    ret = listen(ctrl_lfd, 1);

    epev.events = EPOLLIN;
    epev.data.ptr = (void *)ctrl_connect_handler;

    // add the descriptor ctrl_lfd to the epoll instance
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, ctrl_lfd, &epev) == -1) {
        return -1;
    }

    maxevents++;
    // check whether this path is writable
    use_inkernel_interface = !access(INKERNEL_MINFREE_PATH, W_OK);
    if (use_inkernel_interface) {
        ALOGI("Using in-kernel low memory killer interface");
    } else {
        ret = init_mp(MEMPRESSURE_WATCH_LEVEL, (void *)&mp_event);
        if (ret)
            ALOGE("Kernel does not support memory pressure events or in-kernel low memory killer");
    }

    for (i = 0; i <= ADJTOSLOT(OOM_SCORE_ADJ_MAX); i++) {
        procadjslot_list[i].next = &procadjslot_list[i];
        procadjslot_list[i].prev = &procadjslot_list[i];
    }
    return 0;
}

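Here, whether the kernel interface is used to handle lmk events is decided by checking whether the node /sys/module/lowmemorykiller/parameters/minfree is writable. By default the system has write permission on this node, which means use_inkernel_interface=1.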
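
The writability probe is a single access() call. A minimal sketch, with a hypothetical wrapper name path_is_writable:

```c
#include <stdbool.h>
#include <unistd.h>

/* True when the calling process may write to path, mirroring !access(path, W_OK). */
bool path_is_writable(const char *path)
{
    return access(path, W_OK) == 0;
}
```

On a device without the lowmemorykiller driver the minfree node does not exist, access() fails, and lmkd falls back to its user-space memory-pressure path, as the else branch of init() shows.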

lowmem_shrink

Triggering the shrink operation:

static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
{
    struct task_struct *tsk;
    struct task_struct *selected = NULL;
    int rem = 0;
    int tasksize;
    int i;
    int ret = 0;
    short min_score_adj = OOM_SCORE_ADJ_MAX + 1; //1001
    int minfree = 0;
    int selected_tasksize = 0;
    int selected_oom_score_adj;
    int array_size = ARRAY_SIZE(lowmem_adj);
    int other_free;
    int other_file;
    unsigned long nr_to_scan = sc->nr_to_scan;

    if (nr_to_scan > 0) {
        if (mutex_lock_interruptible(&scan_mutex) < 0)
            return 0;
    }

    // free pages remaining
    other_free = global_page_state(NR_FREE_PAGES);

    if (global_page_state(NR_SHMEM) + total_swapcache_pages <
        global_page_state(NR_FILE_PAGES))
        other_file = global_page_state(NR_FILE_PAGES) -
                        global_page_state(NR_SHMEM) -
                        total_swapcache_pages;
    else
        other_file = 0;

    tune_lmk_param(&other_free, &other_file, sc);

    if (lowmem_adj_size < array_size)
        array_size = lowmem_adj_size;
    if (lowmem_minfree_size < array_size)
        array_size = lowmem_minfree_size;
    for (i = 0; i < array_size; i++) {
        minfree = lowmem_minfree[i];
        if (other_free < minfree && other_file < minfree) {
            min_score_adj = lowmem_adj[i];
            break;
        }
    }
    if (nr_to_scan > 0) {
        ret = adjust_minadj(&min_score_adj);
        lowmem_print(3, "lowmem_shrink %lu, %x, ofree %d %d, ma %hd\n",
                nr_to_scan, sc->gfp_mask, other_free,
                other_file, min_score_adj);
    }

    rem = global_page_state(NR_ACTIVE_ANON) +
        global_page_state(NR_ACTIVE_FILE) +
        global_page_state(NR_INACTIVE_ANON) +
        global_page_state(NR_INACTIVE_FILE);
    if (nr_to_scan <= 0 || min_score_adj == OOM_SCORE_ADJ_MAX + 1) {
        lowmem_print(5, "lowmem_shrink %lu, %x, return %d\n",
                 nr_to_scan, sc->gfp_mask, rem);

        if (nr_to_scan > 0)
            mutex_unlock(&scan_mutex);

        if ((min_score_adj == OOM_SCORE_ADJ_MAX + 1) &&
            (nr_to_scan > 0))
            trace_almk_shrink(0, ret, other_free, other_file, 0);

        return rem;
    }
    selected_oom_score_adj = min_score_adj;

    rcu_read_lock();
    for_each_process(tsk) {
        struct task_struct *p;
        int oom_score_adj;

        if (tsk->flags & PF_KTHREAD)
            continue;

        /* if task no longer has any memory ignore it */
        if (test_task_flag(tsk, TIF_MM_RELEASED))
            continue;

        if (time_before_eq(jiffies, lowmem_deathpending_timeout)) {
            if (test_task_flag(tsk, TIF_MEMDIE)) {
                rcu_read_unlock();
                /* give the system time to free up the memory */
                msleep_interruptible(20);
                mutex_unlock(&scan_mutex);
                return 0;
            }
        }

        p = find_lock_task_mm(tsk);
        if (!p)
            continue;

        oom_score_adj = p->signal->oom_score_adj;
        // oom_score_adj below the minimum: skip
        if (oom_score_adj < min_score_adj) {
            task_unlock(p);
            continue;
        }
        // RSS of the process
        tasksize = get_mm_rss(p->mm);
        task_unlock(p);
        if (tasksize <= 0)
            continue;
        if (selected) {
            if (oom_score_adj < selected_oom_score_adj)
                continue;
            if (oom_score_adj == selected_oom_score_adj &&
                tasksize <= selected_tasksize)
                continue;
        }
        selected = p;
        selected_tasksize = tasksize;
        selected_oom_score_adj = oom_score_adj;
        lowmem_print(3, "select '%s' (%d), adj %hd, size %d, to kill\n",
                 p->comm, p->pid, oom_score_adj, tasksize);
    }
    if (selected) {
        lowmem_print(1, "Killing '%s' (%d), adj %d,\n"
                "   to free %ldkB on behalf of '%s' (%d) because\n"
                "   cache %ldkB is below limit %ldkB for oom_score_adj %hd\n"
                "   Free memory is %ldkB above reserved.\n"
                "   Free CMA is %ldkB\n"
                "   Total reserve is %ldkB\n"
                "   Total free pages is %ldkB\n"
                "   Total file cache is %ldkB\n"
                "   Slab Reclaimable is %ldkB\n"
                "   Slab UnReclaimable is %ldkB\n"
                "   Total Slab is %ldkB\n"
                "   GFP mask is 0x%x\n",
                 selected->comm, selected->pid,
                 selected_oom_score_adj,
                 selected_tasksize * (long)(PAGE_SIZE / 1024),
                 current->comm, current->pid,
                 other_file * (long)(PAGE_SIZE / 1024),
                 minfree * (long)(PAGE_SIZE / 1024),
                 min_score_adj,
                 other_free * (long)(PAGE_SIZE / 1024),
                 global_page_state(NR_FREE_CMA_PAGES) *
                (long)(PAGE_SIZE / 1024),
                 totalreserve_pages * (long)(PAGE_SIZE / 1024),
                 global_page_state(NR_FREE_PAGES) *
                (long)(PAGE_SIZE / 1024),
                 global_page_state(NR_FILE_PAGES) *
                (long)(PAGE_SIZE / 1024),
                 global_page_state(NR_SLAB_RECLAIMABLE) *
                (long)(PAGE_SIZE / 1024),
                 global_page_state(NR_SLAB_UNRECLAIMABLE) *
                (long)(PAGE_SIZE / 1024),
                 global_page_state(NR_SLAB_RECLAIMABLE) *
                (long)(PAGE_SIZE / 1024) +
                 global_page_state(NR_SLAB_UNRECLAIMABLE) *
                (long)(PAGE_SIZE / 1024),
                 sc->gfp_mask);

        if (lowmem_debug_level >= 2 && selected_oom_score_adj == 0) {
            show_mem(SHOW_MEM_FILTER_NODES);
            dump_tasks(NULL, NULL);
            show_mem_call_notifiers();
        }

        lowmem_deathpending_timeout = jiffies + HZ;
        send_sig(SIGKILL, selected, 0);
        set_tsk_thread_flag(selected, TIF_MEMDIE);
        rem -= selected_tasksize;
        rcu_read_unlock();
        /* give the system time to free up the memory */
        msleep_interruptible(20);
        trace_almk_shrink(selected_tasksize, ret,
            other_free, other_file, selected_oom_score_adj);
    } else {
        trace_almk_shrink(1, ret, other_free, other_file, 0);
        rcu_read_unlock();
    }

    lowmem_print(4, "lowmem_shrink %lu, %x, return %d\n",
             nr_to_scan, sc->gfp_mask, rem);
    mutex_unlock(&scan_mutex);
    return rem;
}

  • Among the processes with the largest oom_score_adj, the one with the largest RSS is selected as the process to kill.
  • How it is killed: send_sig(SIGKILL, selected, 0) sends signal 9 to the selected process.

3.3 mainloop

static void mainloop(void) {
    while (1) {
        struct epoll_event events[maxevents];
        int nevents;
        int i;
        ctrl_dfd_reopened = 0;

        // wait for events on epollfd
        nevents = epoll_wait(epollfd, events, maxevents, -1);
        if (nevents == -1) {
            if (errno == EINTR)
                continue;
            continue;
        }
        for (i = 0; i < nevents; ++i) {
            if (events[i].events & EPOLLERR)
                ALOGD("EPOLLERR on event #%d", i);
            // when an event arrives, call the handler stored in data.ptr, initially ctrl_connect_handler [see Section 3.4]
            if (events[i].data.ptr)
                (*(void (*)(uint32_t))events[i].data.ptr)(events[i].events);
        }
    }
}

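The main loop calls epoll_wait() and waits for events on epollfd. If the wait is interrupted or fails, the loop simply continues. When an event arrives, the handler stored with it is called;
for the control socket this is ctrl_connect_handler, which was registered in init().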
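
The dispatch technique, storing a handler function pointer in epoll_event.data.ptr and calling it when the descriptor becomes ready, can be tried in isolation. The sketch below is an assumption-laden stand-in: it uses a pipe instead of the lmkd socket, and the names demo_handler, handled_count and dispatch_once are invented for the example.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

typedef void (*event_handler)(uint32_t events);

static int handled_count;  /* incremented by the demo handler */

/* Demo handler: counts readable events. */
void demo_handler(uint32_t events)
{
    if (events & EPOLLIN)
        handled_count++;
}

/*
 * Register the read end of a pipe with a handler stored in data.ptr,
 * make it readable, wait once, and dispatch. Returns the number of
 * handler invocations, or -1 on error.
 */
int dispatch_once(void)
{
    int pipefd[2];
    struct epoll_event ev = { .events = EPOLLIN }, out[4];
    int epfd, n, result = -1;

    handled_count = 0;
    if (pipe(pipefd) == -1)
        return -1;
    epfd = epoll_create1(0);
    if (epfd == -1)
        goto close_pipe;

    ev.data.ptr = (void *)demo_handler;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto close_all;
    if (write(pipefd[1], "x", 1) != 1)
        goto close_all;

    /* Wait up to one second, then call whatever handler came back with each event. */
    n = epoll_wait(epfd, out, 4, 1000);
    for (int i = 0; i < n; i++) {
        if (out[i].data.ptr)
            ((event_handler)out[i].data.ptr)(out[i].events);
    }
    result = handled_count;

close_all:
    close(epfd);
close_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return result;
}
```

Because the handler travels with the event, the loop needs no table mapping descriptors to callbacks, which is how lmkd can register ctrl_connect_handler and ctrl_data_handler on different descriptors of the same epoll instance.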

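lmkd parameters

  • oom_adj: the priority of a process. The larger the value, the lower the priority and the more likely the process is to be killed.
    Range: [-16, 15]
  • oom_score_adj: range [-1000, 1000]
  • oom_score: the lmk policy does not appear to use it anywhere; it is probably used only by the OOM killer.

How lowmem_oom_adj_to_oom_score_adj computes the mapping: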

static int lowmem_oom_adj_to_oom_score_adj(int oom_adj)
{
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    else
        return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}

  • If oom_adj = 15, then oom_score_adj = 1000;
  • If oom_adj < 15, then oom_score_adj = oom_adj * 1000 / 17.
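
A standalone version of the mapping makes it easy to check a few values by hand. The constant values below (OOM_DISABLE = -17, OOM_ADJUST_MAX = 15, OOM_SCORE_ADJ_MAX = 1000) are the ones implied by the formula above, and the function name is shortened for the example.

```c
#define OOM_DISABLE       (-17)
#define OOM_ADJUST_MAX    15
#define OOM_SCORE_ADJ_MAX 1000

/* Map a legacy oom_adj value onto the oom_score_adj scale. */
int oom_adj_to_oom_score_adj(int oom_adj)
{
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}
```

Integer division truncates, so oom_adj = 1 maps to 58 and oom_adj = 6 maps to 352. The special case for 15 exists because 15 * 1000 / 17 would give 882 rather than the maximum of 1000.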

3.4 ctrl_connect_handler

static void ctrl_connect_handler(uint32_t events __unused) {
    struct epoll_event epev;
    if (ctrl_dfd >= 0) {
        ctrl_data_close();
        ctrl_dfd_reopened = 1;
    }
    ctrl_dfd = accept(ctrl_lfd, NULL, NULL);
    if (ctrl_dfd < 0) {
        ALOGE("lmkd control socket accept failed; errno=%d", errno);
        return;
    }
    ALOGI("ActivityManager connected");
    maxevents++;
    epev.events = EPOLLIN;
    epev.data.ptr = (void *)ctrl_data_handler;

    // add ctrl_dfd to epollfd
    if (epoll_ctl(epollfd, EPOLL_CTL_ADD, ctrl_dfd, &epev) == -1) {
        ALOGE("epoll_ctl for data connection socket failed; errno=%d", errno);
        ctrl_data_close();
        return;
    }
}

When an event fires on that connection, ctrl_data_handler is called.

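4 Summary

The whole process described above can be summarized as follows:

  • The framework layer assigns adj values dynamically according to the lifecycle of each type of process,
    and at certain points it updates the adj of all processes;
  • When adj is updated, the framework layer talks to the lmkd daemon, which adjusts the parameters
    configured for the lmk driver and sets /proc/pid/oom_score_adj;
  • The lowmemorykiller driver is invoked by the Linux kernel's memory shrinker mechanism. In the
    shrinker callback it evaluates each process's adj and RSS and, based on the driver's oom_adj and
    minfree configuration, kills processes.

So to deal with background applications being reclaimed, pay particular attention to:

  • The process lifecycle and the five priority classes
  • Reducing memory usage, and releasing memory promptly at trimmemory time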

3.5 ctrl_data_handler

static void ctrl_data_handler(uint32_t events) {
    if (events & EPOLLHUP) {
        // the ActivityManager connection has been closed
        if (!ctrl_dfd_reopened)
            ctrl_data_close();
    } else if (events & EPOLLIN) {
        // [see Section 3.6]
        ctrl_command_handler();
    }
}

References:

  1. API Guide: Processes and Threads
  2. ActivityManagerService.java
  3. ProcessList.java
  4. lmkd.c
  5. lowmemorykiller.c

3.6 ctrl_command_handler

static void ctrl_command_handler(void) {
    int ibuf[CTRL_PACKET_MAX / sizeof(int)];
    int len;
    int cmd = -1;
    int nargs;
    int targets;
    len = ctrl_data_read((char *)ibuf, CTRL_PACKET_MAX);
    if (len <= 0)
        return;
    nargs = len / sizeof(int) - 1;
    if (nargs < 0)
        goto wronglen;
    // convert from network byte order to host byte order
    cmd = ntohl(ibuf[0]);
    switch(cmd) {
    case LMK_TARGET:
        targets = nargs / 2;
        if (nargs & 0x1 || targets > (int)ARRAY_SIZE(lowmem_adj))
            goto wronglen;
        cmd_target(targets, &ibuf[1]);
        break;
    case LMK_PROCPRIO:
        if (nargs != 3)
            goto wronglen;
        // set the process adj [see Section 3.7]
        cmd_procprio(ntohl(ibuf[1]), ntohl(ibuf[2]), ntohl(ibuf[3]));
        break;
    case LMK_PROCREMOVE:
        if (nargs != 1)
            goto wronglen;
        cmd_procremove(ntohl(ibuf[1]));
        break;
    default:
        ALOGE("Received unknown command code %d", cmd);
        return;
    }
    return;
wronglen:
    ALOGE("Wrong control socket read length cmd=%d len=%d", cmd, len);
}

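CTRL_PACKET_MAX equals (sizeof(int) * (MAX_TARGETS * 2 + 1)).
With MAX_TARGETS=6, on a system where sizeof(int)=4 this gives CTRL_PACKET_MAX=52.
Once the buf data passed from the framework has been read, execution branches on the three command types.
Next, we continue with the LMK_PROCPRIO command sent earlier and step into cmd_procprio.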
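
The length checks in ctrl_command_handler can be isolated into a small validator. This is a sketch under assumptions: the function name packet_length_ok is hypothetical, the command codes 0, 1 and 2 are assumed from the order of the commands, and the check that the length is a multiple of sizeof(int) is stricter than what the handler above does.

```c
#include <stddef.h>

#define MAX_TARGETS 6
#define CTRL_PACKET_MAX (sizeof(int) * (MAX_TARGETS * 2 + 1))

enum { LMK_TARGET, LMK_PROCPRIO, LMK_PROCREMOVE };  /* assumed values: 0, 1, 2 */

/*
 * Hypothetical validator: given a command code and the packet length in
 * bytes, return 1 if the argument count is acceptable, 0 otherwise.
 */
int packet_length_ok(int cmd, size_t len)
{
    size_t nargs;

    if (len < sizeof(int) || len > CTRL_PACKET_MAX || len % sizeof(int) != 0)
        return 0;
    nargs = len / sizeof(int) - 1;  /* the first int is the command itself */

    switch (cmd) {
    case LMK_TARGET:
        /* pairs of (minfree, adj), at most MAX_TARGETS pairs */
        return nargs % 2 == 0 && nargs / 2 <= MAX_TARGETS;
    case LMK_PROCPRIO:
        return nargs == 3;  /* pid, uid, adj */
    case LMK_PROCREMOVE:
        return nargs == 1;  /* pid */
    default:
        return 0;
    }
}
```

A 16-byte LMK_PROCPRIO packet passes because it carries one command word plus three arguments, and the largest valid LMK_TARGET packet is exactly CTRL_PACKET_MAX bytes: one command word plus six pairs.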

3.7 cmd_procprio

static void cmd_procprio(int pid, int uid, int oomadj) {
    struct proc *procp;
    char path[80];
    char val[20];
    ...
    snprintf(path, sizeof(path), "/proc/%d/oom_score_adj", pid);
    snprintf(val, sizeof(val), "%d", oomadj);
    // write oomadj to the node /proc/<pid>/oom_score_adj
    writefilestring(path, val);

    // when the kernel interface is in use, return here
    if (use_inkernel_interface)
        return;
    procp = pid_lookup(pid);
    if (!procp) {
            procp = malloc(sizeof(struct proc));
            if (!procp) {
                // Oh, the irony.  May need to rebuild our state.
                return;
            }
            procp->pid = pid;
            procp->uid = uid;
            procp->oomadj = oomadj;
            proc_insert(procp);
    } else {
        proc_unslot(procp);
        procp->oomadj = oomadj;
        proc_slot(procp);
    }
}

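oomadj is written to the node /proc/<pid>/oom_score_adj. Since use_inkernel_interface=1, the next thing to look at is the kernel side.

3.8 Summary

use_inkernel_interface will probably move to a user-space policy over time. For now use_inkernel_interface=1, so:

  • LMK_PROCPRIO: writes oomadj to /proc/<pid>/oom_score_adj, then returns;
  • LMK_PROCREMOVE: does nothing and returns;
  • LMK_TARGET: writes the corresponding values to the minfree and adj nodes under /sys/module/lowmemorykiller/parameters.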

4. Kernel Layer

The lowmemorykiller driver is located at drivers/staging/android/lowmemorykiller.c

4.1 lowmemorykiller initialization

static struct shrinker lowmem_shrinker = {
    .scan_objects = lowmem_scan,
    .count_objects = lowmem_count,
    .seeks = DEFAULT_SEEKS * 16
};

static int __init lowmem_init(void)
{
    register_shrinker(&lowmem_shrinker);
    return 0;
}

static void __exit lowmem_exit(void)
{
    unregister_shrinker(&lowmem_shrinker);
}

module_init(lowmem_init);
module_exit(lowmem_exit);

register_shrinker and unregister_shrinker are used for initialization and exit, respectively.

4.2 shrinker

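The LMK driver is implemented by registering a shrinker. The shrinker is the Linux
kernel's standard mechanism for reclaiming memory pages, and it is driven by the kernel thread kswapd.

When memory runs short, kswapd walks a list of shrinkers and calls back each registered shrinker function to reclaim pages; kswapd also wakes up periodically to do memory work. Each zone maintains an active_list and an inactive_list, the kernel moves pages between the two lists according to how active they are, and pages are finally reclaimed through shrink_slab and shrink_zone. Readers who want to learn more about Linux memory reclaim can study it on their own; here we return to the LowMemoryKiller analysis.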

4.3 lowmem_count

static unsigned long lowmem_count(struct shrinker *s,
                  struct shrink_control *sc)
{
    return global_page_state(NR_ACTIVE_ANON) +
        global_page_state(NR_ACTIVE_FILE) +
        global_page_state(NR_INACTIVE_ANON) +
        global_page_state(NR_INACTIVE_FILE);
}

ANON denotes anonymous mappings, which have no backing store; FILE denotes file mappings. Total =
active anonymous + active file + inactive anonymous + inactive file

4.4 lowmem_scan

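When the low memory killer is triggered, the process with the largest oom_score_adj is killed first;
when several processes share that oom_score_adj, the one with the largest RSS is chosen.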

static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
{
    struct task_struct *tsk;
    struct task_struct *selected = NULL;
    unsigned long rem = 0;
    int tasksize;
    int i;
    short min_score_adj = OOM_SCORE_ADJ_MAX + 1;
    int minfree = 0;
    int selected_tasksize = 0;
    short selected_oom_score_adj;
    int array_size = ARRAY_SIZE(lowmem_adj);
    // current free memory
    int other_free = global_page_state(NR_FREE_PAGES) - totalreserve_pages;
    int other_file = global_page_state(NR_FILE_PAGES) -
                        global_page_state(NR_SHMEM) -
                        total_swapcache_pages();
    // clamp to the configured array sizes
    if (lowmem_adj_size < array_size)
        array_size = lowmem_adj_size;
    if (lowmem_minfree_size < array_size)
        array_size = lowmem_minfree_size;

    // walk lowmem_minfree to find the matching minimum adj
    for (i = 0; i < array_size; i++) {
        minfree = lowmem_minfree[i];
        if (other_free < minfree && other_file < minfree) {
            min_score_adj = lowmem_adj[i];
            break;
        }
    }

    if (min_score_adj == OOM_SCORE_ADJ_MAX + 1) {
        return 0;
    }
    selected_oom_score_adj = min_score_adj;

    rcu_read_lock();
    for_each_process(tsk) {
        struct task_struct *p;
        short oom_score_adj;
        if (tsk->flags & PF_KTHREAD)
            continue;
        p = find_lock_task_mm(tsk);
        if (!p)
            continue;
        if (test_tsk_thread_flag(p, TIF_MEMDIE) &&
            time_before_eq(jiffies, lowmem_deathpending_timeout)) {
            task_unlock(p);
            rcu_read_unlock();
            return 0;
        }
        oom_score_adj = p->signal->oom_score_adj;
        // skip processes below the target adj
        if (oom_score_adj < min_score_adj) {
            task_unlock(p);
            continue;
        }
        // the process's Resident Set Size: its private memory plus shared libraries
        tasksize = get_mm_rss(p->mm);
        task_unlock(p);
        if (tasksize <= 0)
            continue;

        // key step: among processes with the largest oom_score_adj, pick the one with the largest RSS
        if (selected) {
            if (oom_score_adj < selected_oom_score_adj)
                continue;
            if (oom_score_adj == selected_oom_score_adj &&
                tasksize <= selected_tasksize)
                continue;
        }
        selected = p;
        selected_tasksize = tasksize;
        selected_oom_score_adj = oom_score_adj;
        lowmem_print(2, "select '%s' (%d), adj %hd, size %d, to kill\n",
                 p->comm, p->pid, oom_score_adj, tasksize);
    }

    if (selected) {
        long cache_size = other_file * (long)(PAGE_SIZE / 1024);
        long cache_limit = minfree * (long)(PAGE_SIZE / 1024);
        long free = other_free * (long)(PAGE_SIZE / 1024);

        lowmem_deathpending_timeout = jiffies + HZ;
        set_tsk_thread_flag(selected, TIF_MEMDIE);
        // send signal 9 to the selected process to kill it
        send_sig(SIGKILL, selected, 0);
        rem += selected_tasksize;
    }
    rcu_read_unlock();
    return rem;
}

  • Among the processes with the largest oom_score_adj, the one with the largest RSS is selected as the process to kill.
  • How it is killed: send_sig(SIGKILL, selected, 0) sends signal 9 to the selected
    process.
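
The selection rule can be exercised outside the kernel on a plain array. The sketch below mirrors the comparison logic of the loop in lowmem_scan; the struct candidate type and the function select_victim are invented for this example and stand in for task_struct and the for_each_process walk.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields lowmem_scan reads from each task. */
struct candidate {
    int pid;
    int oom_score_adj;
    long rss_pages;
};

/*
 * Return the index of the process to kill, or -1 if none qualifies.
 * Candidates below min_score_adj or with no resident pages are skipped;
 * the highest oom_score_adj wins, and ties go to the larger RSS.
 */
int select_victim(const struct candidate *procs, size_t n, int min_score_adj)
{
    int selected = -1;

    for (size_t i = 0; i < n; i++) {
        const struct candidate *p = &procs[i];

        if (p->oom_score_adj < min_score_adj || p->rss_pages <= 0)
            continue;
        if (selected >= 0) {
            const struct candidate *s = &procs[selected];

            if (p->oom_score_adj < s->oom_score_adj)
                continue;
            if (p->oom_score_adj == s->oom_score_adj &&
                p->rss_pages <= s->rss_pages)
                continue;
        }
        selected = (int)i;
    }
    return selected;
}
```

Note that a large foreground process (oom_score_adj 0) is never chosen while the threshold is above 0, however much memory it holds; memory size only breaks ties within the same oom_score_adj.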

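In addition, the lowmem_minfree[] and lowmem_adj[] arrays each hold up to 6 entries and are exposed as module parameters through declarations such as: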

module_param_named(debug_level, lowmem_debug_level, uint, S_IRUGO | S_IWUSR);    
module_param_array_named(adj, lowmem_adj, short, &lowmem_adj_size, S_IRUGO | S_IWUSR);

When the data in the following nodes changes, the lowmem_minfree[] and lowmem_adj[] arrays are updated accordingly:

/sys/module/lowmemorykiller/parameters/minfree
/sys/module/lowmemorykiller/parameters/adj

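5. Summary

This article started from ProcessList.java in the framework, which adjusts adj and sends the event over a socket to the native daemon lmkd. lmkd then performs the operation the command asks for; its main job is to update each process's oom_score_adj and the lowmemorykiller driver's parameters (minfree and adj).

Finally it covered the lowmemorykiller driver. The driver registers a shrinker and hooks into the standard Linux memory reclaim mechanism. From the memory currently available and the configured parameters (adj, minfree) it derives a suitable selected_oom_score_adj, then picks, among all processes whose adj is at least that target, the one with the largest oom_score_adj and, among those, the largest RSS, and kills it to free memory.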

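5.1 lmkd parameters:

  • oom_adj: the priority of a process. The larger the value, the lower the priority and the more likely the process is to be killed.
    Range: [-16, 15]
  • oom_score_adj: range [-1000, 1000]
  • oom_score: the lmk policy does not appear to use it anywhere; it is probably used only by the OOM killer.

To see these three values for a process you only need its pid; read the following nodes: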

/proc/<pid>/oom_adj
/proc/<pid>/oom_score_adj
/proc/<pid>/oom_score
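
Reading one of these nodes takes only a few lines of C. A minimal sketch, with a hypothetical helper name read_proc_int; it works for any of the three files because each holds a single integer.

```c
#include <stdio.h>

/*
 * Read an integer from a /proc/<pid>/ file such as oom_score_adj.
 * Returns 0 on success and stores the value in *out, -1 on failure.
 */
int read_proc_int(const char *path, int *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = fscanf(f, "%d", out) == 1;
    fclose(f);
    return ok ? 0 : -1;
}
```

For example, read_proc_int("/proc/self/oom_score_adj", &value) reads the calling process's own oom_score_adj without needing to know its pid.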

oom_adj and oom_score_adj are related by the following mapping:

  • If oom_adj = 15, then oom_score_adj = 1000;
  • If oom_adj < 15, then oom_score_adj = oom_adj * 1000 / 17.

5.2 Driver parameters

/sys/module/lowmemorykiller/parameters/minfree (number of pages)
/sys/module/lowmemorykiller/parameters/adj (oom_score_adj values)

Example: write 1,6 to the node /sys/module/lowmemorykiller/parameters/adj and 1024,8192 to the node /sys/module/lowmemorykiller/parameters/minfree. Policy: when available memory falls below 8192 pages, processes with oom_score_adj >= 6 are killed; when it falls below 1024 pages, processes with oom_score_adj >= 1 are killed.
