An intermediate code or bytecode is a programming language that serves as a bridge between a high-level programming language and the machine code read by microprocessors in computers. This intermediate code is a translation of the high-level language, which an interpreter then translates back into a machine language that the computer hardware can process.
A well-known example of an intermediate code is the so-called three-way code, where each instruction in this code operates on three elements: two operands and a result.
Why do we need an intermediate code? Mainly for practical compatibility reasons. The intermediate code is generated independently of any technical requirements of the hardware or platform on which it is to be translated, so it can be converted and adapted to different operating systems and computer systems.
How does the intermediate code work?
Let's imagine that we want to program a computer application in Python. At some point, we will have to transform the Python code into an intermediate language or bytecode. So, this would be the process for generating intermediate code:
High-level code generation
A developer creates software programmed with Python, or any high-level programming language that other developers can read. In many cases, developers write the code of the programs in integrated development environments which have a wide range of tools at their disposal. However, in this environment the code cannot be translated to a hardware-readable language.
Compilation to intermediate code
A compiler is the tool par excellence that converts source code to intermediate code. This program translates strings from source code to intermediate code, or to machine code or any other programming language by means of lexical, syntactic and semantic analysis.
Final machine code translation
An interpreter, a virtual machine, is installed on all computer systems where the application is to be tested. It will do the final translation of intermediate code to machine code composed of binary numbers (1 or 0) so that computer processors can process the final code and execute it.
It is worth mentioning that the virtual machine that serves as the interpreter is not interchangeable between different operating systems or computer architectures, Each one requires an interpreter with very specific characteristics.
Programming languages using intermediate code
Several programming languages use intermediate code as part of their compilation or interpretation process. Some of these are:
-
Java: Java source code is compiled into bytecode and then interpreted into machine code by the Java Virtual Machine.
-
C#: The source code of C# is compiled in CIL (Common Intermediate Language), also known as MSIL (Microsoft Intermediate Language). The Common Language Runtime runs the code, the .NET virtual machine.
-
JavaScript: Particularly in advanced engines such as V8 in Google Chrome and Node.js, JavaScript also compiles the source code into bytecode.
Example of intermediate language
This bytecode is generated in Javac, the Javay compiler, and is executable in a Java Virtual Machine (JVM).
First, let's create a simple Java program ad hoc for the example at hand:
public class HelloWorld {
public static void main(String[] args) {
System.out.println(«Hello, World!»);
}
}
When this code is compiled using the javac command HelloWorld.java, a bytecode file called HelloWorld.class is generated. The content of this file is not oriented towards human readability, but can be decompiled with tools such as javap (included in the JDK).
If we run javap -c HelloWorld.class, we will get a representation of the bytecode in a more readable format. This bytecode is an intermediate representation between the high-level source code (Java) and the platform-specific machine code. Subsequently, the virtual machine will interpret it according to the specific hardware for which it has to translate it:
Compiled from «HolaMundo.java».»
public class HelloWorld {
public HelloWorld();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object.»»:()V
4: return
public static void main(java.lang.String[]);
Code:
0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #3 // String Hello, World!
5: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
}



