# Implementing LC-3 in Fortran

I’m writing a virtual machine for the Little Computer 3, an educational computer from the textbook of Patt & Patel (2003) [1]. It’s a great way to learn about computer architecture, hence I decided to make a tutorial-like post if any of you are interested in joining along. Feel free to ask questions or brainstorm the best way to implement the virtual machine.

## What is LC-3?

LC-3 is a register machine with a word size of 16-bits and a 2^16-location address space (131 KB of memory). It has eight general purpose registers named R0-R7, a program counter, and three status registers.

The full instruction set architecture (ISA) description can be found in [2]. You can find additional resources with your favorite search engine.

### Memory

The “virtualized” memory in Fortran becomes:

``````integer(int16) :: mem(0:65535) ! memory

integer(int16) :: R(0:7) ! registers
integer :: PC            ! program counter
logical :: N, Z, P       ! status registers
``````

Some of the decisions to make here are:

• How to represent the memory? Should it be an array of `int16` or an array of `character(len=2)`?
• What about the registers? The `int16` has the benefit the arithmetic operations and bit manipulation functions are immediately available. On the other hand, integer wraparound is not guaranteed, so a compiler option like gcc’s `-fwrapv` might be needed.
• How to represent the status registers for conditional branching? As three logical variables (1-bit of information each) or simply as an `integer` mask?

(For simplicity, let’s assume we are on a two-complement machine, and the byte size is 8-bits. If you want to make it more challenging, feel free to lift this assumption.)

Originally, the program counter is supposed to be stored in (special-purpose) 16-bit register. Since Fortran only has signed integers, using `int16` would complicate memory addressing, so we may as well store it in an integer type with larger range.

### Instruction Set

The instruction set is very lean with only 15 instructions. There are no dedicated subtract, multiply, or divide instructions. Instead a subtraction can be performed by negating the operand and addition. Multiplication can be performed by a sequence of additions. For floating point it is necessary to write a floating-point library.

Each instruction is composed of an opcode (what the computer should do) and its operands. Some of the instructions support different addressing modes (where the data is located, registers, immediate data, or memory).

The LC3 Instruction Set [1]:

The instructions can be categorized into three groups:

• Operate instructions (ADD, AND, NOT, LEA*); these process data
• Data movement instructions (LD,LDR,LDI,ST,STR,STI); used to load and store data, moving it between registers and memory
• Control instructions (BR, JMP, JSR, TRAP, RTI); used to change the sequence of operations

The (virtual) arithmethic-logic unit takes an opcode (the top 4 bits) and operands, and performs the operation. In Fortran this becomes one of the following:

• a `select case` construct
• a table of procedure pointers
• a computed `go to` statement
``````instr = mem(PC)               ! Fetch instruction from memory
opcode = ibits(instr,12,4)    ! Extract opcode

select case(opcode)
case(AND)
... perform bitwise-AND ...
case(...)
...
end select
``````
``````type :: instr
procedure(lc3_instr), pointer, nopass :: p
end type

type(instr), parameter :: ops(16) = [ instr(ADD), instr(AND), ...]

call ops(opcode)%p()
``````
``````    go to (10, 20, 30, ...) opcode
...
20  continue ! AND
...
30 continue  ! ...
...
``````

The `select case` construct is the easiest to start with. The other two options could be useful in an advanced setting where we implement software pipelining (a technique for implementing instruction-level parallelism) so that the execution of current opcode and decoding of the next instruction overlap.

After jumping to the right opcode branch we simply emulate the instruction using the built-in Fortran statements, operators, subroutines, and intrinsic functions. The definitions of the instructions can be found in [2]. Note that two instruction opcodes can be ignored (RTI - return from interrupt, and the reserved opcode `1101`) for the start.

### Fetch-execute cycle

The “heart” of the virtual machine is the instruction cycle (also known as fetch-decode-execute or simply fetch-execute cycle). After booting-up (in our case launching the virtual machine program), the computer enters the central processing cycle:

``````do
! 1) Fetch instruction from memory
!    ...

! 2) Decode operation
!    ...

! 3) Execute the instruction (ALU)
!    ...

end do
``````

For practical reasons the steps 2 and 3 can be overlapped in the sense that we do decoding within the `select case` statement.

### Putting it all together

The skeleton of the virtual machine is now,

``````program lc3
use, intrinsic :: iso_fortran_env, only: int16
implicit none

integer(int16) :: mem(0:65535) ! memory
integer(int16) :: R(0:7) ! registers
integer :: PC ! program counter
logical :: N, Z, P ! status registers

! TODO: Boot the virtual machine
!       1) Read user program into memory
!       2) Initialize program counter and other registers

do

!  TODO: Fetch instruction and decode
instr = ... fetch instruction from memory ...
opcode = ... extract opcode (upper 4 bits) ...
PC = PC + 1 ! Adjust program counter

!  TODO: Implement instructions (and decoding)
select case(opcode)
case(...)
...
end select

end do

end program
``````

The program is to be invoked from the shell as:

``````\$ lc3 path/to/object-file
``````

where the object file contains the instructions stored in big-endian (!) format.

Besides the textbook [1], you can help yourself with the following tutorials:

### Writing programs for LC-3

#### Assembly

In principle you could assemble programs manually, but this is very laborious. Software tools (assemblers and a C-compiler) can be found on the webpage [3] accompanying the Patt & Patel (2003) textbook. They are a somewhat outdated, but overhauled versions can be found on GitHub, for example lc3tools.

An alternative I’ve found is a Rust program called customasm which allows you to define your own assembly language. There is also an online version of it: customasm/web. The syntax of customasm is described here.

I have done my best to define the LC-3 instruction set in customasm here: LC-3 for customasm, but I don’t guarantee it is without errors. You can copy-paste it into the customasm web app, or save it in a file, which is then included in your assembly program, for example:

``````#include "LC-3.asm"

; Program to multiply an integer by the constant 6.
; Before execution, an integer must be stored in NUMBER
LD  R1,SIX
LD  R2,NUMBER
AND R3,R3,0

; The inner loop
;
AGAIN:
BRp AGAIN
;
HALT
;

NUMBER:
#res 1      ; Reserve one word for external input
SIX:
#d 0x0006   ; SIX = 6
``````

Customasm can output the instructions and data in different formats. Here is the annotated binary format,

`````` outp | addr | data (base 2)

0:0 | 3000 | 00100010 00000111 ; LD  R1,SIX
2:0 | 3001 | 00100100 00000101 ; LD  R2,NUMBER
4:0 | 3002 | 01010110 11100000 ; AND R3,R3,0
6:0 | 3003 |                   ; AGAIN:
6:0 | 3003 | 00010110 11000010 ; ADD R3,R3,R2
8:0 | 3004 | 00010010 01111111 ; ADD R1,R1,-1
a:0 | 3005 | 00000011 11111101 ; BRp AGAIN
c:0 | 3006 | 11110000 00100101 ; HALT
e:0 | 3007 |                   ; NUMBER:
10:0 | 3008 |                   ; SIX:
10:0 | 3008 | 00000000 00000110 ; 0x0006
``````

As a hex string this would become:

``````2207240556e016c2127f03fdf02500000006
``````

(Each hex number corresponds to 4-bit binary number, so 4 hex numbers are one instruction.)

#### Higher-level programming languages

A C to LC-3 compiler named `lcc` is available as part of the student resources [3] that accompany the textbook. A few different copies of it can be found on GitHub too. I managed to get this one working: GitHub - haplesshero13/lcc-lc3: LC3 C Compiler for the Little Computer v3, but a few people commented elsewhere it can break easily.

In principle one could use f2c, to transpile a Fortran program to C and then compile it for LC-3. But I doubt this will work straight out of the box. It’s very likely that parts of `f2c` or `lcc` will need to be adapted first.

An alternative path would be to write an experimental code generation backend for LFortran (or any other compiler).

### References

[1] Patt, Yale N.; Patel, Sanjay (2004). Introduction to Computing Systems: From Bits and Gates to C and Beyond. New York, NY: McGraw-Hill Higher Education. (A third edition was released in 2020, including C++)
[2] The LC-3 ISA, Excerpt from the Patt & Patel (2004) book, https://www.jmeiners.com/lc3-vm/supplies/lc3-isa.pdf (PDF, 349 KB)
[3] Student Resources: Introduction to Computing Systems

19 Likes

Looks like a very nice summer project.

In fact, I might give it a try myself…

1 Like

You can define memory and register to be int32 and then forget the higher bits, at the end it is just an emulator.

Or you can define your memory as:

``````integer(int16) :: memory(-128:127)
``````

And then hope that your computer uses 2-complements to implement negative integer.
Nice project!

It would be nice if you try to implement the fetch decode and execute using parallel facility in Fortran, like coarrays etc.

That would be supernice!

I suppose you could do that, but it would be kind of inefficient to waste half the memory. In principle we could also pack two words into an `int32` and then introduce functions for reading and writing individual words:

``````integer(int32) :: mem(32768)

! ...

integer, intent(in) :: i ! An address between z'0000' and z'FFFF' (inclusive)
integer(word) :: memread ! Output of range -32768 to 32767
integer :: j

! | 0 | 1 | 2 | 3 | 4 | 5 | ... | - Memory address
! |   1   |   2   |   3   | ... | - Fortran array index

j = i / 2 + 1
if (mod(i,2) == 0) then
In a Fortran implementation which doesn’t support `int16` it may be the only way to do it, besides using `character(len=_)` for storage.