YON CISE

JVM 源码阅读指北

你有没有好奇过字节码究竟是怎么在 JVM 里执行的?你有没有因为网上查的关于 JVM 的资料相互之间说法矛盾而不知道该相信谁?你有没有曾经想要看 JVM 源码但不知道从何下手?

这篇文章会带你从头开始编译 JVM,告诉你如何在 IDE 里进行 Debug,并会简单过下 JVM 中 字节码执行、垃圾回收 以及 即时编译 的相关流程。

不用担心自己不熟悉 C++ 和 汇编 而看不懂 JVM 的源码。只要你熟悉 Java,有一点点 C 的基础,那么 “看懂” C++ 代码就没有太大问题,遇到一些不熟悉的语法直接问 ChatGPT 就行了。

环境准备

为了屏蔽底层操作系统的相关差异,我们会使用 Docker 来进行编译,然后在容器里运行程序,最后用 IDE + gdbserver 的方式进行远程调试。

我的环境情况如下:

  1. macOS Sonoma 14.4.1
    1. CPU M1 Pro
    2. 内存 32GB
  2. CLion 2024.2.2
  3. OpenJDK 11.0.0.2
    1. 下载地址 https://jdk.java.net/java-se-ri/11-MR3
  4. Docker 镜像
FROM ubuntu:18.04

RUN apt update && \
    apt install -y vim gdb \
    build-essential file unzip zip \
    # JDK 编译需要 Boot JDK,编译 11 需要 10 以上的版本
    openjdk-11-jdk \
    libx11-dev libxext-dev libxrender-dev libxtst-dev libxt-dev \
    libcups2-dev libfontconfig1-dev libasound2-dev

CMD bash

编译

下面开始进行 JDK 的编译:

  1. 解压下载下来的 openjdk,假设解压到 ~/Downloads/openjdk
  2. 构建 Docker 镜像 docker build -t jdkbuilder .
  3. 运行镜像 docker run -v ~/Downloads/openjdk:/openjdk -it -p 1234:1234 jdkbuilder
    1. 1234 端口是为了后面进行远程 debug
  4. 在 Docker 里进行构建
cd /openjdk
bash configure --with-jvm-variants=server \
     --disable-warnings-as-errors \
     --with-debug-level=slowdebug \
     --with-native-debug-symbols=internal
# 编译整个 JDK
make images

在给 Docker 分配 10 核 10G 的情况下,完整编译整个耗时大概是 7 分钟。

现在我们就可以运行我们自己编译的 JDK 了

/openjdk/build/linux-aarch64-normal-server-slowdebug/jdk/bin/java -version

输出结果如下:

openjdk version "11.0.0.2-internal" 2024-07-02
OpenJDK Runtime Environment (slowdebug build 11.0.0.2-internal+0-adhoc..openjdk)
OpenJDK 64-Bit Server VM (slowdebug build 11.0.0.2-internal+0-adhoc..openjdk, mixed mode)

调试

我们先通过 gdb 找到程序的入口:

$ gdb java
(gdb) b main
Breakpoint 1 at 0xe98: file /openjdk/src/java.base/share/native/launcher/main.c, line 98.
(gdb) 

成功找到入口文件:/openjdk/src/java.base/share/native/launcher/main.c,在 CLion 里找到 main 函数下个断点。

为了后面调试的方便(主要就是忽略一些操作系统的中断),我们在本机先配置下 gdb,新建 ~/.gdbinit文件,内容如下:

handle SIGILL pass noprint nostop
handle SIGSEGV pass noprint nostop

接着我们在容器里运行 gdbserver:

cd /openjdk/build/linux-aarch64-normal-server-slowdebug/jdk/bin/
gdbserver :1234 ./java -version

下面在 CLion 进行连接,具体设置可以参考 The Remote Debug configuration,关键要设置的参数是:

  1. Debugger 里选择 GDB
  2. ‘target remote’ args 填 127.0.0.1:1234
  3. Path mappings 里 Remote 填容器里的 openjdk 地址 /openjdk,Local 填你本机的 openjdk 的绝对地址

运行后,不出意外程序就会成功停在下断点的地方了。

引子

上面我们一直以 java -version来做例子,那我们就先看下它背后究竟是怎么运行的。简单看下代码,不用太关心细节,前面主要就是些参数的处理,main 函数里的核心是:

// main.c
return JLI_Launch(margc, margv,
               jargc, (const char**) jargv,
               0, NULL,
               VERSION_STRING,
               DOT_VERSION,
               (const_progname != NULL) ? const_progname : *margv,
               (const_launcher != NULL) ? const_launcher : *margv,
               jargc > 0,
               const_cpwildcard, const_javaw, 0);

在 IDE 里跟进去,来到 src/java.base/share/native/libjli/java.c,同样的,忽略细节,我们大概看下代码,根据命名可以看到里面做了些 JVM 加载的事情,最后来到:

// java.c
return JVMInit(&ifn, threadStackSize, argc, argv, mode, what, ret);

这里说下,因为我们并没有去配置 CLion 的环境,所以这时候你如果想要在 IDE 里跟进去,应该只会跳到头文件里,而看不到方法的定义,这时候我们就只能用搜索大法了。以 JVMInit为关键字,我们可以看到在一些名字形如 java_md_*.c 的文件里定义了这个函数,有多个定义是因为里面有平台相关的实现(md 是 machine-dependent 的意思) OpenJDK 需要针对不同的操作系统进行实现,我们以 linux 平台的实现 java_md_solinux.c来看下

// java_md_solinux.c
int
JVMInit(InvocationFunctions* ifn, jlong threadStackSize,
        int argc, char **argv,
        int mode, char *what, int ret)
{
    ShowSplashScreen();
    return ContinueInNewThread(ifn, threadStackSize, argc, argv, mode, what, ret);
}

最终来到 ContinueInNewThread0函数:

// java_md_solinux.c
rslt = ContinueInNewThread0(JavaMain, threadStackSize, (void*)&args);

里面的实现你可以看到就是调用 linux 的方法启动了一个操作系统线程,然后去运行了 JavaMain这个方法。对于 Java 程序员来说可能会觉得这里有点奇怪,为什么能传一个函数名?这是因为 C/C++ 里函数就是一个指针,指向一段内存里的代码区域,启动一个线程无非就是去运行那段内存里的代码。我们回到 JavaMain这个函数,可以看到:

// java.c
if (printVersion || showVersion) {
    PrintJavaVersion(env, showVersion);
    CHECK_EXCEPTION_LEAVE(0);
    if (printVersion) {
        LEAVE();
    }
}

进入 PrintJavaVersion函数:

// java.c
/*
 * Prints the version information from the java.version and other properties.
 */
static void
PrintJavaVersion(JNIEnv *env, jboolean extraLF)
{
    jclass ver;
    jmethodID print;

    NULL_CHECK(ver = FindBootStrapClass(env, "java/lang/VersionProps"));
    NULL_CHECK(print = (*env)->GetStaticMethodID(env,
                                                 ver,
                                                 (extraLF == JNI_TRUE) ? "println" : "print",
                                                 "(Z)V"
                                                 )
              );

    (*env)->CallStaticVoidMethod(env, ver, print, printTo);
}

根据函数名大概猜下,java -version最终实际运行的是 java.lang.VersionProps这个类的 println方法。

到这里你可能会有很多问题,比如 JNIEnv *env是怎么来的?CallStaticVoidMethod内部又是怎么运行的?我们先来看看 JNIEnv *env是怎么来的,也就是 JVM 是怎么初始化的。

JVM 初始化

回到 JavaMain函数,我们可以看到这段代码:

// java.c
/* Initialize the virtual machine */
start = CounterGet();
if (!InitializeJVM(&vm, &env, &ifn)) {
    JLI_ReportErrorMessage(JVM_ERROR1);
    exit(1);
}

进入 InitializeJVM

// java.c
/*
 * Initializes the Java Virtual Machine. Also frees options array when
 * finished.
 */
static jboolean
InitializeJVM(JavaVM **pvm, JNIEnv **penv, InvocationFunctions *ifn)
{
    JavaVMInitArgs args;
    jint r;

    memset(&args, 0, sizeof(args));
    args.version  = JNI_VERSION_1_2;
    args.nOptions = numOptions;
    args.options  = options;
    args.ignoreUnrecognized = JNI_FALSE;

    if (JLI_IsTraceLauncher()) {
        int i = 0;
        printf("JavaVM args:\n    ");
        printf("version 0x%08lx, ", (long)args.version);
        printf("ignoreUnrecognized is %s, ",
               args.ignoreUnrecognized ? "JNI_TRUE" : "JNI_FALSE");
        printf("nOptions is %ld\n", (long)args.nOptions);
        for (i = 0; i < numOptions; i++)
            printf("    option[%2d] = '%s'\n",
                   i, args.options[i].optionString);
    }
    r = ifn->CreateJavaVM(pvm, (void **)penv, &args);
    JLI_MemFree(options);
    return r == JNI_OK;
}

关键是 ifn->CreateJavaVM(pvm, (void **)penv, &args);这段代码,但是 CreateJavaVMInvocationFunctions 这个结构体的一个属性,所以要找到它对应的函数定义需要我们找到 ifn是在哪里初始化的。

回到 java.cJLI_Launch函数,我们可以看到这段代码:

// java.c
if (!LoadJavaVM(jvmpath, &ifn)) {
    return(6);
}

LoadJavaVM又是一个平台相关的函数,我们可以在 java_md_solinux.c里找到定义,里面有一段这个代码:

// java.c
ifn->CreateJavaVM = (CreateJavaVM_t)
    dlsym(libjvm, "JNI_CreateJavaVM");
if (ifn->CreateJavaVM == NULL) {
    JLI_ReportErrorMessage(DLL_ERROR2, jvmpath, dlerror());
    return JNI_FALSE;
}

这里用 linux 的动态链接的方式调用了 JNI_CreateJavaVM这个方法,通过搜索大法我们找到它的定义在 src/hotspot/share/prims/jni.cpp(看下路径,这里我们已经进入 HotSpot 的代码了),里面代码很简单,核心代码是:

// jni.cpp
result = JNI_CreateJavaVM_inner(vm, penv, args);

JNI_CreateJavaVM_inner里面就是真正的 JVM 初始化代码里,还记得我们最初的目的吗?JNIEnv *env是怎么来的。我们可以找到这么一段代码:

// jni.cpp
result = Threads::create_vm((JavaVMInitArgs*) args, &can_try_again);
if (result == JNI_OK) {
  JavaThread *thread = JavaThread::current();
  assert(!thread->has_pending_exception(), "should have returned not OK");
  /* thread is thread_in_vm here */
  *vm = (JavaVM *)(&main_vm);
  *(JNIEnv**)penv = thread->jni_environment();

至此,JVM 的初始化就完成了。

字节码执行

前面 java -version最后运行到了 (*env)->CallStaticVoidMethod(env, ver, print, printTo);,执行一个静态方法就会涉及到字节码的执行了,我们来看下CallStaticVoidMethod的实现。

先看下 thread->jni_environment();的实现:

// jni.h
#ifdef __cplusplus
typedef JNIEnv_ JNIEnv;
#else
typedef const struct JNINativeInterface_ *JNIEnv;
#endif
// thread.hpp
JNIEnv        _jni_environment;
JNIEnv* jni_environment()                      { return &_jni_environment; }

因为 HotSpot 是 C++ 实现的,所以我们看下 JNIEnv_ 的定义,里面可以看到:

// jni.h
/*
 * We use inlined functions for C++ so that programmers can write:
 *
 *    env->FindClass("java/lang/String")
 *
 * in C++ rather than:
 *
 *    (*env)->FindClass(env, "java/lang/String")
 *
 * in C.
 */
struct JNIEnv_ {
    const struct JNINativeInterface_ *functions;
#ifdef __cplusplus

    // ...
    void CallStaticVoidMethod(jclass cls, jmethodID methodID, ...) {
        va_list args;
        va_start(args,methodID);
        functions->CallStaticVoidMethodV(this,cls,methodID,args);
        va_end(args);
    }
    // ...
#endif /* __cplusplus */
}

这里你可能会发现方法的入参和前面我们看到的调用的地方不一样,第一个入参不是 env,看下源码里的注释,我们会知道是因为我们之前的调用是在 C 里面发起的,C 的环境下 JNIEnv 的定义是 JNINativeInterface_,里面关于 CallStaticVoidMethod的定义是:

// jni.h
struct JNINativeInterface_ {
    //...
    void (JNICALL *CallStaticVoidMethod)
      (JNIEnv *env, jclass cls, jmethodID methodID, ...);
    //...
}

继续看 functions->CallStaticVoidMethodV(this,cls,methodID,args);,找到 JNINativeInterface_定义,在 jni.cpp文件里, 最终 CallStaticVoidMethodV的实现对应的是jni_CallStaticVoidMethodV这个方法 :

// jni.cpp
JNI_ENTRY(void, jni_CallStaticVoidMethodV(JNIEnv *env, jclass cls, jmethodID methodID, va_list args))
  JNIWrapper("CallStaticVoidMethodV");
  HOTSPOT_JNI_CALLSTATICVOIDMETHODV_ENTRY(env, cls, (uintptr_t) methodID);
  DT_VOID_RETURN_MARK(CallStaticVoidMethodV);

  JavaValue jvalue(T_VOID);
  JNI_ArgumentPusherVaArg ap(methodID, args);
  jni_invoke_static(env, &jvalue, NULL, JNI_STATIC, methodID, &ap, CHECK);
JNI_END

jni_invoke_static里调用 JavaCalls::call(result, method, &java_args, CHECK);,对应方法定义在 javaCalls.cpp里:

// javaCalls.cpp
void JavaCalls::call(JavaValue* result, const methodHandle& method, JavaCallArguments* args, TRAPS) {
  // Check if we need to wrap a potential OS exception handler around thread
  // This is used for e.g. Win32 structured exception handlers
  assert(THREAD->is_Java_thread(), "only JavaThreads can make JavaCalls");
  // Need to wrap each and every time, since there might be native code down the
  // stack that has installed its own exception handlers
  os::os_exception_wrapper(call_helper, result, method, args, THREAD);
}

os::os_exception_wrapper又是个平台相关的函数,linux 下就是直接调用传进来的javaCalls.cpp里的 call_helper函数,里面真正的发起调用的代码是:

// javaCalls.cpp
// do call
{ JavaCallWrapper link(method, receiver, result, CHECK);
  { HandleMark hm(thread);  // HandleMark used by HandleMarkCleaner

    StubRoutines::call_stub()(
      (address)&link,
      // (intptr_t*)&(result->_value), // see NOTE above (compiler problem)
      result_val_address,          // see NOTE above (compiler problem)
      result_type,
      method(),
      entry_point,
      args->parameters(),
      args->size_of_parameters(),
      CHECK
    );

    result = link.result();  // circumvent MS C++ 5.0 compiler bug (result is clobbered across call)
    // Preserve oop return value across possible gc points
    if (oop_result_flag) {
      thread->set_vm_result((oop) result->get_jobject());
    }
  }
} // Exit JavaCallWrapper (can block - potential return oop must be preserved)

StubRoutines::call_stub返回了一个 cpu 架构相关的函数,主要是做一些寄存器、栈的准备(这块没太深入研究),是以类似写汇编指令的形式在运行时动态往内存里写机器指令实现的(x86 下是在 stubGenerator_x86_64.cppgenerate_call_stub里),不过简单看下代码,可以发现最终是调用的 entry_point指向的代码:

// stubGenerator_x86_64.cpp
address start = __ pc();
// ...
// C++ 里栈上分配对象
const Address entry_point   (rbp, entry_point_off    * wordSize);
// ...
// call Java function
__ BIND(parameters_done);
__ movptr(rbx, method);             // get Method*
__ movptr(c_rarg1, entry_point);    // get entry_point
__ mov(r13, rsp);                   // set sender sp
BLOCK_COMMENT("call Java function");
__ call(c_rarg1);
// ...
return start;

entry_point指向的是 address entry_point = method->from_interpreted_entry();method怎么来的,我们先忽略,就看看 from_interpreted_entry()是什么,很简单,就是返回的method_from_interpreted_entry属性,通过搜索大法,我们发现它是这么被设置的:

// method.hpp
void set_interpreter_entry(address entry) {
  assert(!is_shared(), "shared method's interpreter entry should not be changed at run time");
  if (_i2i_entry != entry) {
    _i2i_entry = entry;
  }
  if (_from_interpreted_entry != entry) {
    _from_interpreted_entry = entry;
  }
}

// method.cpp

// Called when the method_holder is getting linked. Setup entrypoints so the method
// is ready to be called from interpreter, compiler, and vtables.
void Method::link_method(const methodHandle& h_method, TRAPS) {
  // ...
  if (!is_shared()) {
    assert(adapter() == NULL, "init'd to NULL");
    address entry = Interpreter::entry_for_method(h_method);
    assert(entry != NULL, "interpreter entry must be non-null");
    // Sets both _i2i_entry and _from_interpreted_entry
    set_interpreter_entry(entry);
  }
  // ...
}

// abstractInterpreter.hpp
static address    entry_for_kind(MethodKind k)                { assert(0 <= k && k < number_of_method_entries, "illegal kind"); return _entry_table[k]; }
static address    entry_for_method(const methodHandle& m)     { return entry_for_kind(method_kind(m)); }

可以看到就是基于方法类型从_entry_table来获取一个地址,_entry_table怎么初始化的呢?依旧是搜索,我们可以看到是在templateInterpreterGenerator.cpp里的 TemplateInterpreterGenerator::generate_all生成的:

// templateInterpreterGenerator.cpp
#define method_entry(kind)                                              \
  { CodeletMark cm(_masm, "method entry point (kind = " #kind ")"); \
    Interpreter::_entry_table[Interpreter::kind] = generate_method_entry(Interpreter::kind); \
    Interpreter::update_cds_entry_table(Interpreter::kind); \
  }

看名字可以知道,这就是我们知道的模版解析器的代码了,模版解释器是把每个字节码对应到一段机器指令的,所以generate_method_entry里调用的generate_normal_entry方法又是一个和 cpu 架构相关的实现了,比如 x86 对应的是 templateInterpreterGenerator_x86.cpp。这里面也是往内存里动态写指令实现的,要看懂需要一些汇编的知识。不过我们不用太关心细节,可以结合一些网上关于模版解释器的介绍,我们可以猜想里面关于 dispatch 的代码是我们要关注的:

// templateInterpreterGenerator_x86.cpp
address TemplateInterpreterGenerator::generate_normal_entry(bool synchronized) {
    // ...
    __ dispatch_next(vtos);
    // ...
}

// interp_masm_x86.cpp
void InterpreterMacroAssembler::dispatch_next(TosState state, int step, bool generate_poll) {
  // load next bytecode (load before advancing _bcp_register to prevent AGI)
  load_unsigned_byte(rbx, Address(_bcp_register, step));
  // advance _bcp_register
  increment(_bcp_register, step);
  dispatch_base(state, Interpreter::dispatch_table(state), true, generate_poll);
}

注意到 Interpreter::dispatch_table(state),相关的实现如下:

// templateInterpreter.hpp
static address*   dispatch_table(TosState state)              { return _active_table.table_for(state); }

// templateInterpreter.cpp
void TemplateInterpreter::initialize() {
  // ...
  // initialize dispatch table
  _active_table = _normal_table;
}

// templateInterpreterGenerator.cpp
void TemplateInterpreterGenerator::set_entry_points(Bytecodes::Code code) {
  // ...
  // code for short & wide version of bytecode
  if (Bytecodes::is_defined(code)) {
    Template* t = TemplateTable::template_for(code);
    assert(t->is_valid(), "just checking");
    set_short_entry_points(t, bep, cep, sep, aep, iep, lep, fep, dep, vep);
  }
  if (Bytecodes::wide_is_defined(code)) {
    Template* t = TemplateTable::template_for_wide(code);
    assert(t->is_valid(), "just checking");
    set_wide_entry_point(t, wep);
  }
  // set entry points
  EntryPoint entry(bep, zep, cep, sep, aep, iep, lep, fep, dep, vep);
  Interpreter::_normal_table.set_entry(code, entry);
  Interpreter::_wentry_point[code] = wep;
}

// templateTable.hpp
// Templates
static Template        _template_table     [Bytecodes::number_of_codes];
static Template* template_for     (Bytecodes::Code code)  { Bytecodes::check     (code); return &_template_table     [code]; }
static Template* template_for_wide(Bytecodes::Code code)  { Bytecodes::wide_check(code); return &_template_table_wide[code]; }

其中 _template_tableTemplateTable这个类的静态属性,它的类型是 Template 的数组,在 C++ 里,数组类型初始化的时候就会把对应的对象的内存都分配了,Java 程序员熟悉的数组对应到 C++ 里其实是 对象指针的数组。_template_table里对象的数据初始化代码是在 templateTable.cpp里的 TemplateTable::initialize()函数里:

// templateTable.cpp
void TemplateTable::initialize() {
  if (_is_initialized) return;
  // ...
  // Java spec bytecodes                ubcp|disp|clvm|iswd  in    out   generator             argument
  def(Bytecodes::_invokevirtual       , ubcp|disp|clvm|____, vtos, vtos, invokevirtual       , f2_byte      );
  def(Bytecodes::_invokespecial       , ubcp|disp|clvm|____, vtos, vtos, invokespecial       , f1_byte      );
  def(Bytecodes::_invokestatic        , ubcp|disp|clvm|____, vtos, vtos, invokestatic        , f1_byte      );
  def(Bytecodes::_invokeinterface     , ubcp|disp|clvm|____, vtos, vtos, invokeinterface     , f1_byte      );
  def(Bytecodes::_invokedynamic       , ubcp|disp|clvm|____, vtos, vtos, invokedynamic       , f1_byte      );
  def(Bytecodes::_new                 , ubcp|____|clvm|____, vtos, atos, _new                ,  _           );
  // ...
}

每个字节码对应的具体实现只要跳转到对应的 generator 就可以看到了。

简单总结下 Java 字节码的执行流程,类加载后,每个类的方法会基于方法类型关联到一个动态初始化的代码,里面会基于字节码去 TemplateTable 查找对应的指令执行。TemplateTable 是在 JVM 初始化时构建的。

垃圾回收

垃圾回收有很多情况会被触发和设置的垃圾回收器也有关,我们这里就考虑使用 G1 回收器的情况下 new 对象时因为空间不足触发垃圾回收的这种情况。我们不会涉及到 GC 的具体逻辑,只是梳理下 GC 被触发的流程。

基于上面介绍的字节码执行的知识,我们现在要看 new 对应的字节码的实现,相关代码如下:

// templateTable_x86.cpp
void TemplateTable::_new() {
  //...
  Register rarg1 = LP64_ONLY(c_rarg1) NOT_LP64(rax);
  Register rarg2 = LP64_ONLY(c_rarg2) NOT_LP64(rdx);

  __ get_constant_pool(rarg1);
  __ get_unsigned_2_byte_index_at_bcp(rarg2, 1);
  call_VM(rax, CAST_FROM_FN_PTR(address, InterpreterRuntime::_new), rarg1, rarg2);
   __ verify_oop(rax);

  // continue
  __ bind(done);
}

// interpreterRuntime.cpp
IRT_ENTRY(void, InterpreterRuntime::_new(JavaThread* thread, ConstantPool* pool, int index))
  //...
  oop obj = klass->allocate_instance(CHECK);
  thread->set_vm_result(obj);
IRT_END

// instanceKlass.cpp
instanceOop InstanceKlass::allocate_instance(TRAPS) {
  //...
  i = (instanceOop)Universe::heap()->obj_allocate(this, size, CHECK_NULL);
  if (has_finalizer_flag && !RegisterFinalizersAtInit) {
    i = register_finalizer(i, CHECK_NULL);
  }
  return i;
}

// collectedHeap.cpp
oop CollectedHeap::obj_allocate(Klass* klass, int size, TRAPS) {
  ObjAllocator allocator(klass, size, THREAD);
  return allocator.allocate();
}

// memAllocator.cpp
oop MemAllocator::allocate() const {
  oop obj = NULL;
  {
    Allocation allocation(*this, &obj);
    HeapWord* mem = mem_allocate(allocation);
    if (mem != NULL) {
      obj = initialize(mem);
    }
  }
  return obj;
}

HeapWord* MemAllocator::mem_allocate(Allocation& allocation) const {
  if (UseTLAB) {
    HeapWord* result = allocate_inside_tlab(allocation);
    if (result != NULL) {
      return result;
    }
  }

  return allocate_outside_tlab(allocation);
}

HeapWord* MemAllocator::allocate_outside_tlab(Allocation& allocation) const {
  allocation._allocated_outside_tlab = true;
  HeapWord* mem = _heap->mem_allocate(_word_size, &allocation._overhead_limit_exceeded);
  if (mem == NULL) {
    return mem;
  }

  NOT_PRODUCT(_heap->check_for_non_bad_heap_word_value(mem, _word_size));
  size_t size_in_bytes = _word_size * HeapWordSize;
  _thread->incr_allocated_bytes(size_in_bytes);

  return mem;
}

// g1CollectedHeap.cpp
HeapWord*
G1CollectedHeap::mem_allocate(size_t word_size,
                              bool*  gc_overhead_limit_was_exceeded) {
  assert_heap_not_locked_and_not_at_safepoint();

  if (is_humongous(word_size)) {
    return attempt_allocation_humongous(word_size);
  }
  size_t dummy = 0;
  return attempt_allocation(word_size, word_size, &dummy);
}

inline HeapWord* G1CollectedHeap::attempt_allocation(size_t min_word_size,
                                                     size_t desired_word_size,
                                                     size_t* actual_word_size) {
  //...
  HeapWord* result = _allocator->attempt_allocation(min_word_size, desired_word_size, actual_word_size);

  if (result == NULL) {
    *actual_word_size = desired_word_size;
    result = attempt_allocation_slow(desired_word_size);
  }
  //...
}

整个代码不复杂,就是跳来跳去的,我上面是按照调用顺序整理的,最终走到 attempt_allocation_slow方法,里面就有触发 GC 相关的代码了:

// g1CollectedHeap.cpp
HeapWord* G1CollectedHeap::attempt_allocation_slow(size_t word_size) {
    //...
    if (should_try_gc) {
      bool succeeded;
      result = do_collection_pause(word_size, gc_count_before, &succeeded,
                                   GCCause::_g1_inc_collection_pause);
      //...
    }
    //...
}

HeapWord* G1CollectedHeap::do_collection_pause(size_t word_size,
                                               uint gc_count_before,
                                               bool* succeeded,
                                               GCCause::Cause gc_cause) {
  assert_heap_not_locked_and_not_at_safepoint();
  VM_G1CollectForAllocation op(word_size,
                               gc_count_before,
                               gc_cause,
                               false, /* should_initiate_conc_mark */
                               g1_policy()->max_pause_time_ms());
  VMThread::execute(&op);

  HeapWord* result = op.result();
  bool ret_succeeded = op.prologue_succeeded() && op.pause_succeeded();
  assert(result == NULL || ret_succeeded,
         "the result should be NULL if the VM did not succeed");
  *succeeded = ret_succeeded;

  assert_heap_not_locked();
  return result;
}

可以发现 GC 是通过在 VMThread 里执行对应的 VM_Operation 实现的,具体逻辑只要去看 VM_G1CollectForAllocation的实现就可以了。

即时编译

同样的,我们也不会涉及具体编译的逻辑,只是梳理下即时编译触发的流程。

我们知道,JVM 是基于方法被调用次数以及循环执行的次数来决策是否进行即时编译的。所以我们可以去看invokevirtual字节码的实现,相关代码如下:

// templateTable_x86.cpp
void TemplateTable::invokevirtual(int byte_no) {
  transition(vtos, vtos);
  assert(byte_no == f2_byte, "use this argument");
  prepare_invoke(byte_no,
                 rbx,    // method or vtable index
                 noreg,  // unused itable index
                 rcx, rdx); // recv, flags

  // rbx: index
  // rcx: receiver
  // rdx: flags

  invokevirtual_helper(rbx, rcx, rdx);
}
void TemplateTable::invokevirtual_helper(Register index,
                                         Register recv,
                                         Register flags) {
  //...
  __ jump_from_interpreted(method, rdx);
}

// interp_masm_x86.cpp
void InterpreterMacroAssembler::jump_from_interpreted(Register method, Register temp) {
  //...
  jmp(Address(method, Method::from_interpreted_offset()));
}

// method.hpp
static ByteSize from_interpreted_offset()      { return byte_offset_of(Method, _from_interpreted_entry ); }

可以看到,最终又调用到了我们前面介绍的 TemplateInterpreterGenerator::generate_normal_entry生成的代码里,可以看到和方法计数相关的代码:

// templateInterpreterGenerator_x86.cpp
address TemplateInterpreterGenerator::generate_normal_entry(bool synchronized) {
  //...
  // increment invocation count & check for overflow
  Label invocation_counter_overflow;
  Label profile_method;
  Label profile_method_continue;
  if (inc_counter) {
    generate_counter_incr(&invocation_counter_overflow,
                          &profile_method,
                          &profile_method_continue);
    if (ProfileInterpreter) {
      __ bind(profile_method_continue);
    }
  }
  //...
  // invocation counter overflow
  if (inc_counter) {
    if (ProfileInterpreter) {
      // We have decided to profile this method in the interpreter
      __ bind(profile_method);
      __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::profile_method));
      __ set_method_data_pointer_for_bcp();
      __ get_method(rbx);
      __ jmp(profile_method_continue);
    }
    // Handle overflow of counter and compile method
    __ bind(invocation_counter_overflow);
    generate_counter_overflow(continue_after_compile);
  }
  return entry_point;
}

void TemplateInterpreterGenerator::generate_counter_overflow(Label& do_continue) {
  //...
  __ call_VM(noreg,
             CAST_FROM_FN_PTR(address,
                              InterpreterRuntime::frequency_counter_overflow),
             rarg);
  //...
}

//interpreterRuntime.cpp
nmethod* InterpreterRuntime::frequency_counter_overflow(JavaThread* thread, address branch_bcp) {
  nmethod* nm = frequency_counter_overflow_inner(thread, branch_bcp);
  //...
}

IRT_ENTRY(nmethod*,
          InterpreterRuntime::frequency_counter_overflow_inner(JavaThread* thread, address branch_bcp))
  //...
  nmethod* osr_nm = CompilationPolicy::policy()->event(method, method, branch_bci, bci, CompLevel_none, NULL, thread);
  //...
IRT_END

//simpleThresholdPolicy.cpp
nmethod* SimpleThresholdPolicy::event(const methodHandle& method, const methodHandle& inlinee,
                                      int branch_bci, int bci, CompLevel comp_level, CompiledMethod* nm, JavaThread* thread) {
  //...
  if (bci == InvocationEntryBci) {
    method_invocation_event(method, inlinee, comp_level, nm, thread);
  } else {
    // method == inlinee if the event originated in the main method
    method_back_branch_event(method, inlinee, bci, comp_level, nm, thread);
    // Check if event led to a higher level OSR compilation
    nmethod* osr_nm = inlinee->lookup_osr_nmethod_for(bci, comp_level, false);
    if (osr_nm != NULL && osr_nm->comp_level() > comp_level) {
      // Perform OSR with new nmethod
      return osr_nm;
    }
  }
  return NULL;
}

void SimpleThresholdPolicy::method_invocation_event(const methodHandle& mh, const methodHandle& imh,
                                                      CompLevel level, CompiledMethod* nm, JavaThread* thread) {
  if (should_create_mdo(mh(), level)) {
    create_mdo(mh, thread);
  }
  CompLevel next_level = call_event(mh(), level, thread);
  if (next_level != level) {
    if (maybe_switch_to_aot(mh, level, next_level, thread)) {
      // No JITting necessary
      return;
    }
    if (is_compilation_enabled() && !CompileBroker::compilation_is_in_queue(mh)) {
      compile(mh, InvocationEntryBci, next_level, thread);
    }
  }
}

void SimpleThresholdPolicy::compile(const methodHandle& mh, int bci, CompLevel level, JavaThread* thread) {
  //...
  if (!CompileBroker::compilation_is_in_queue(mh)) {
    if (PrintTieredEvents) {
      print_event(COMPILE, mh, mh, bci, level);
    }
    submit_compile(mh, bci, level, thread);
  }
}

void SimpleThresholdPolicy::submit_compile(const methodHandle& mh, int bci, CompLevel level, JavaThread* thread) {
  int hot_count = (bci == InvocationEntryBci) ? mh->invocation_count() : mh->backedge_count();
  update_rate(os::javaTimeMillis(), mh());
  CompileBroker::compile_method(mh, bci, level, mh, hot_count, CompileTask::Reason_Tiered, thread);
}

nmethod* CompileBroker::compile_method(const methodHandle& method, int osr_bci,
                                         int comp_level,
                                         const methodHandle& hot_method, int hot_count,
                                         CompileTask::CompileReason compile_reason,
                                         DirectiveSet* directive,
                                         Thread* THREAD) {
        //...
        compile_method_base(method, osr_bci, comp_level, hot_method, hot_count, compile_reason, is_blocking, THREAD);
        //...
}

void CompileBroker::compile_method_base(const methodHandle& method,
                                        int osr_bci,
                                        int comp_level,
                                        const methodHandle& hot_method,
                                        int hot_count,
                                        CompileTask::CompileReason compile_reason,
                                        bool blocking,
                                        Thread* thread) {
        //...
        task = create_compile_task(queue,
                               compile_id, method,
                               osr_bci, comp_level,
                               hot_method, hot_count, compile_reason,
                               blocking);
        //...
}

CompileTask* CompileBroker::create_compile_task(CompileQueue*       queue,
                                                int                 compile_id,
                                                const methodHandle& method,
                                                int                 osr_bci,
                                                int                 comp_level,
                                                const methodHandle& hot_method,
                                                int                 hot_count,
                                                CompileTask::CompileReason compile_reason,
                                                bool                blocking) {
  CompileTask* new_task = CompileTask::allocate();
  new_task->initialize(compile_id, method, osr_bci, comp_level,
                       hot_method, hot_count, compile_reason,
                       blocking);
  queue->add(new_task);
  return new_task;
}

按照上面的调用顺序一路跟下来,我们可以看到最终会生成一个编译任务并放到一个队列里,显然会有另外一个线程从这个队列里取任务来执行。通过下断点,关键字搜索,再结合我们前面介绍的 JVM 初始化过程,不难发现最终编译的线程的入口是在 thread.cpp里定义的:

// thread.cpp
static void compiler_thread_entry(JavaThread* thread, TRAPS) {
  assert(thread->is_Compiler_thread(), "must be compiler thread");
  CompileBroker::compiler_thread_loop();
}

CompileBroker::compiler_thread_loop()就会不停的从编译任务队列里取任务来执行:

// compileBroker.cpp
void CompileBroker::compiler_thread_loop() {
    //...
    // Compile the method.
    if ((UseCompiler || AlwaysCompileLoopMethods) && CompileBroker::should_compile_new_jobs()) {
        invoke_compiler_on_method(task);
        thread->start_idle_timer();
    } else {
        // After compilation is disabled, remove remaining methods from queue
        method->clear_queued_for_compilation();
        task->set_failure_reason("compilation is disabled");
    }
    //...
}

void CompileBroker::invoke_compiler_on_method(CompileTask* task) {
    //...
    comp->compile_method(&ci_env, target, osr_bci, directive);
    //...
}

//c1_Compiler.cpp
void Compiler::compile_method(ciEnv* env, ciMethod* method, int entry_bci, DirectiveSet* directive) {
    BufferBlob* buffer_blob = CompilerThread::current()->get_buffer_blob();
    assert(buffer_blob != NULL, "Must exist");
    // invoke compilation
    {
        // We are nested here because we need for the destructor
        // of Compilation to occur before we release the any
        // competing compiler thread
        ResourceMark rm;
        Compilation c(this, env, method, entry_bci, buffer_blob, directive);
    }
}

//c1_Compilation.cpp
Compilation::Compilation(AbstractCompiler* compiler, ciEnv* env, ciMethod* method,
int osr_bci, BufferBlob* buffer_blob, DirectiveSet* directive)
//...
{
    //...
    compile_method();
    //...
}

void Compilation::compile_method() {
    //...
    if (InstallMethods) {
        // install code
        PhaseTraceTime timeit(_t_codeinstall);
        install_code(frame_size);
    }
    //...
}

void Compilation::install_code(int frame_size) {
    // frame_size is in 32-bit words so adjust it intptr_t words
    assert(frame_size == frame_map()->framesize(), "must match");
    assert(in_bytes(frame_map()->framesize_in_bytes()) % sizeof(intptr_t) == 0, "must be at least pointer aligned");
    _env->register_method(
        method(),
        osr_bci(),
        &_offsets,
        in_bytes(_frame_map->sp_offset_for_orig_pc()),
        code(),
        in_bytes(frame_map()->framesize_in_bytes()) / sizeof(intptr_t),
        debug_info_recorder()->_oopmaps,
        exception_handler_table(),
        implicit_exception_table(),
        compiler(),
        has_unsafe_access(),
        SharedRuntime::is_wide_vector(max_vector_size())
        );
}

// ciEnv.cpp
void ciEnv::register_method(ciMethod* target,
                            int entry_bci,
                            CodeOffsets* offsets,
                            int orig_pc_offset,
                            CodeBuffer* code_buffer,
                            int frame_words,
                            OopMapSet* oop_map_set,
                            ExceptionHandlerTable* handler_table,
                            ImplicitExceptionTable* inc_table,
                            AbstractCompiler* compiler,
                            bool has_unsafe_access,
                            bool has_wide_vectors,
                            RTMState  rtm_state) {
  //...
  // Allow the code to be executed
  method->set_code(method, nm);
  //...
}


// method.cpp
// Install compiled code.  Instantly it can execute.
void Method::set_code(const methodHandle& mh, CompiledMethod *code) {
  //...
  // Instantly compiled code can execute.
  if (!mh->is_method_handle_intrinsic())
    mh->_from_interpreted_entry = mh->get_i2c_entry();
}

编译完成后会将我们前面介绍的method_from_interpreted_entry修改到编译过的代码入口,这样下次方法执行的时候就会走即时编译的代码了。