THE UNIVERSITY OF TEESSIDE
SCHOOL OF COMPUTING AND MATHEMATICS
TS1 3BA
USER EXTENSIBLE SYSTEM ARCHITECTURE (UESA)
BSc Computer Science
April 2000
I P Chapman
Supervisor: A. Clements
Second Reader: B. Stoddart
The Central Processing Unit (CPU) of a computer is a piece of hardware that is designed to execute programs. At run-time programs are loaded into memory whereupon the processor sequentially reads the instructions and acts upon their interpretation. Almost all CPUs in use today have a fixed architecture, which includes a set of registers and a fixed instruction set. This report describes a system whereby programs can be executed under a user-defined architecture.
Java was chosen as the programming language because of its portability: the resulting software runs on any platform that has a Java Runtime Environment (JRE). Because the User Extensible System Architecture (UESA) executes programs held in ASCII source code files, it had to be written as a Java application. Java applets, as opposed to applications, run within a web page and are not allowed to access files on the user's computer, so an applet was considered unsuitable for this project. The UESA also takes advantage of Swing, Java's powerful set of Graphical User Interface (GUI) classes.
The greatest difficulty in developing this application was learning to program in Java. I had only ever used more traditional, procedural languages such as C, so a new way of thinking was required when learning an object-orientated language.
I would like to thank my project supervisor, Alan Clements, for his help and guidance throughout the project. I would also like to thank my friend, Darren Gray, for his useful comments on this report.
3. The Central Processing Unit
5.4 The Graphical User Interface (GUI)
6. Implementation and Testing
8. Conclusions and Recommendations
Appendix A – Project Specification
Appendix B – Augmented Backus Naur Form
Appendix C – Sample Prototype Source Code
Appendix D – Flowchart Notation
Appendix E – Initial Memory and Register Designs
Appendix F – UML Class Diagrams
Appendix G – HTML User Documentation
Appendix H – User Interface Designs
Appendix I – Assembly Test Programs
Appendix J – Sample Source Code
A Central Processing Unit has a fixed architecture composed of several basic components. These components interact with program instructions during execution and determine the processor's final speed and efficiency. In general these components cannot be changed once the processor has been manufactured in silicon.
A virtual processor with a user-definable architecture has the potential to allow the user to dry-run software under a specific architecture to see how it performs and operates. By making changes to the architecture of a processor, one may also see how this affects the operation of the software. A configurable processor opens up the possibility of emulating a multitude of real processors within a single package.
Many parts of a processor affect its performance; however, the UESA concentrates on three main areas: the processor's register set, which can be composed of both data and instruction registers; the main store; and the instruction set used by the processor.
This report describes an application which allows the user to define several aspects of a processor. The target audience of such an application may be those who are interested in testing how processor components affect program performance, those who are interested in the potential of emulating other architectures or those who wish to create their own.
Before working on this project, I had no previous experience with an object-orientated language such as Java, so it was also an opportunity for me to gain experience and knowledge in this area.
The original project specification is attached in Appendix A.
The background research for this report involved studying Java and understanding the basic concepts of a CPU. This report describes the product of that research and the work involved from analysis to testing.
Chapter 2 summarises the methodologies that were used in the creation of the software.
Chapter 3 describes the basics of a CPU and introduces the reader to the CPU terminology used throughout the report.
Chapter 4 discusses the suitability of Java as a programming language for this project and provides a brief history of Java and object orientated programming.
Chapter 5 describes the analysis and design of the project including the design of the CPU components, the graphical user interface (GUI) and the design and layout of assembly source code.
Chapter 6 discusses the implementation of the system and the methods used for testing.
Chapter 7 evaluates both the software and project, with the conclusions and recommendations being described in Chapter 8.
This chapter describes the methodologies used from analysis to implementation. Section 2.1 describes why certain limitations have been imposed and section 2.2 describes the methodologies.
It was important at the start of the project to lay down which aspects of the UESA should be configurable by the user and what limitations should be imposed. It was necessary to impose certain limitations for practical reasons and time constraints.
A major component of a processor's architecture is its register set. The limitations on the register set were based on common processor designs available today. For example, the Motorola 68000 series of processors has both general-purpose data registers and address registers available to the programmer [7], whereas the ARM series of processors makes no distinction between data and address registers [6]. It was decided that the user should be able to define both address and data registers. The next step was deciding what the maximum number of registers should be. Again, this was based on real processors in use today. Both the ARM and 68000 series of processors have 16 general-purpose registers, whereas PowerPC processors have 32 [9].
The CPU's instruction set is another important factor, which varies considerably from one architecture to another. The instruction set should be definable by the user; the difficulty was in deciding how the instructions could be defined. Register Transfer Language (RTL) is a common method for describing data movement between a processor's registers and memory. Initially it seemed quite appropriate for describing an instruction. The mnemonic would be chosen by the user and an RTL sequence would define its operation. It became clear, however, that the user would have to be limited to a generic subset of RTL, because specific sequences would lead to some rather obscure assembly.
Java was chosen as a suitable programming language, which I began to explore by means of working through textbooks and attempting example programs and exercises [3], [1].
The design approach involved three main steps.
Step one involved using a technical drawing package called Visio, which allows the user to visually describe software in a choice of notations. Diagrams were constructed in Visio to describe how different aspects of the UESA work and operate. Basic diagrams were drawn to show how registers and memory locations might be represented, and flowcharts were constructed to describe the processes involved. This was particularly applicable in describing the routine for parsing an ASCII assembly source code file. The appearance of the graphical user interface was also designed visually with Visio using standard interface components. Although the source code for the GUI could have been generated automatically from the designs, the GUI was instead written manually. It was felt that writing the code by hand was necessary in order to understand the GUI classes of Java correctly.
Augmented Backus Naur Form (BNF) is a notation used for describing the structure of a programming language. This was deemed suitable for describing how the assembly source code should be written and in turn for describing what the parsing routine would expect. A description of the Augmented BNF notation is attached in Appendix B.
Step two dealt with the translation of the designs into prototype source code. Prototype source code is necessary because it allows the programmer to test the robustness of their designs, without the complications of the entire project and its code clouding the situation. It was also appropriate because of the uncertainties of how to implement the software due to the unfamiliarity of Java. At the same time, errors in the design or limitations within the programming language can be detected and the design re-worked if necessary. Prototype source code was developed on a class-by-class basis, with linking of the classes being done in step three. Samples of some of the prototype source code are attached in Appendix C.
Although Unified Modelling Language (UML) was not used to design the classes of the UESA, it is a useful language for describing the structure of the classes.
Step Three dealt with the final implementation of the prototype source code. Once the prototype class was proven to work satisfactorily, it was integrated into the main source code. Regular backups of the entire source code were made to a fileserver whenever new source code was implemented, or existing source code changed.
The Central Processing Unit (CPU) is a piece of hardware, often incorporated onto a single chip or integrated circuit (IC). The CPU is responsible for executing instructions that form the basis of programs. Instructions are read sequentially from memory, decoded and executed. The CPU is often responsible for communicating with external devices called peripherals. There are technical differences between the terms CPU and processor, however in modern day use they have become almost synonymous.
There are many components of a CPU, all of which function together in order to execute programs and process data. This section will concentrate on those components which the UESA implements.
Although in the strictest of terms the main store memory is usually not part of the CPU, it is closely related to its operation. Any CPU would have little practical functionality without it. Main store memory is responsible for storing the instructions and data of programs awaiting processing and execution. Each instruction or item of data is stored in a specific location or across several locations, which are identified by unique addresses. The more memory a processor has at its disposal, the greater the size and number of programs which it can execute.
Registers are simply small units of memory, commonly around 32 bits in size, integrated within the CPU. Processors often contain different registers designed for specific purposes, some of which may be manipulated by the programmer and others that are designed for the processor's internal use.
The UESA is concerned with two main types of general-purpose registers, both of which can be manipulated by the programmer. These are data registers and address registers. Their primary use is for the fast storage and retrieval of temporary data during program execution to help avoid the need to access the much slower main memory. Address Registers differ slightly in the fact they are designed to hold an address, or in other words a pointer to a memory location containing an item of data.
3.2.3 Instructions and Operands
All processors have an instruction set that forms the basis of programs executed by the processor. Instructions are stored as a series of bits that can be broken down into two main fields: the operation code and the operands.
The operation code is decoded by the processor and signifies the instruction to perform. Some processors have a fixed length operation code, which means all instructions are defined using the same number of bits; others use a variable length definition.
The second component of an instruction describes one or more operands. These are simply items of data which are operated on by the instruction. It is important to understand, however, that the processor simply sees a sequence of bits. It is the interpretation of those bits and the context of the instruction that determine whether a value represents an integer, a memory location, or even a register.

Figure 3.1 – Diagram of a sequence of bits representing a theoretical instruction
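As an illustration of how the same bits are given meaning only by interpretation, the following minimal Java sketch packs and unpacks a theoretical instruction word. The field widths (an 8-bit operation code followed by two 12-bit operands) and the class name InstructionWord are assumptions made purely for this example; they are not taken from Figure 3.1 or from any real processor.

```java
// Hypothetical 32-bit instruction layout (an assumption for illustration only):
//   bits 31..24  operation code
//   bits 23..12  first operand
//   bits 11..0   second operand
final class InstructionWord {
    private InstructionWord() {}

    // Packs the three fields into one 32-bit word, masking each to its width.
    static int encode(int opcode, int operand1, int operand2) {
        return ((opcode & 0xFF) << 24) | ((operand1 & 0xFFF) << 12) | (operand2 & 0xFFF);
    }

    // Each accessor shifts its field down to bit 0 and masks off the rest.
    static int opcode(int word)   { return (word >>> 24) & 0xFF; }
    static int operand1(int word) { return (word >>> 12) & 0xFFF; }
    static int operand2(int word) { return word & 0xFFF; }
}
```

Whether an operand field such as the value 3 means the integer three, memory location three or register three is not recorded in the word itself; it follows from the operation code.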
Although this method of describing instructions and their operands is perfectly suitable for the processor, it is difficult and convoluted for the programmer to use and understand. In modern computer use the programmer rarely deals with the bits directly, but adopts a more abstract notation that is closer to English. This is called assembly language.
Each instruction is identified by a name, called the mnemonic (e.g. MOVE), and a sequence of characters that describe the operands (see Table 3.1). Assembly code usually contains special commands called assembler directives. They are not so much concerned with the machine code, but instruct the assembler to perform tasks such as attaching symbolic names to variables, constants or even locations within the source code. The assembler, which is itself a program, is responsible for translating the assembly code into the sequence of bits understood by the processor.
Assembly source code is not portable to processors from differing families, because it is simply an “English” representation of the machine code instructions, which vary considerably across different processor families.
Table 3.1 – An example of 68000 assembly source code
Instruction | Description
MOVE.L D3,D0 | Moves the 32bit value contained in Data Register 3 to Data Register 0.
NOP | No operation. Takes no operands and instructs the processor to do nothing.
AND.L D1,D0 | Performs a 32bit logical AND operation on Data Registers 1 and 0, with the result being stored in D0.
The condition code register (CCR), sometimes called the program status register (PSR), contains a set of flags that record the outcome of an operation performed by an instruction. The exact flags that form the CCR vary across processors. The following are examples of the flags used by the Motorola 68000 processor [8].
The flags are immediately set after instruction execution, and may be used by other instructions to alter the outcome of their operation. For example, branching instructions can use the Z flag to determine whether the branch should take place or not. This is analogous to an IF statement in a higher level language.
The program counter is a special register that holds the address in main memory of the next instruction to be fetched. Once the instruction has been fetched, the program counter is incremented and the instruction is executed. Where an instruction turns out to be a branch and the branch is indicated to take place, the program counter is loaded with the address of the instruction to which the branch points.
The UESA does not use a program counter because instructions are not read from main memory. They are executed by directly interpreting the assembly source code, therefore a program counter is considered unnecessary. Instead, branches take place by specifying the absolute line number from where execution should continue.
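The idea of branching to an absolute line number instead of loading a program counter can be sketched as follows. This is a minimal illustration only: the three instructions (DEC, BNZ and END) and the class name LineInterpreter are hypothetical and are not part of the UESA, which lets the user define the instruction set.

```java
import java.util.List;

// Interprets source lines directly; the "current line" plays the role that a
// program counter plays in a real processor.
final class LineInterpreter {
    private LineInterpreter() {}

    // Hypothetical mini language:
    //   DEC    decrement the counter
    //   BNZ n  continue from absolute line n (1-based) if the counter is non-zero
    //   END    stop
    // Returns the number of lines executed.
    static int run(List<String> lines, int counter) {
        int executed = 0;
        int line = 1;
        while (line <= lines.size()) {
            String text = lines.get(line - 1).trim();
            executed++;
            if (text.equals("END")) {
                break;
            } else if (text.equals("DEC")) {
                counter--;
                line++;                       // fall through to the next line
            } else if (text.startsWith("BNZ ")) {
                int target = Integer.parseInt(text.substring(4).trim());
                line = (counter != 0) ? target : line + 1;
            } else {
                throw new IllegalArgumentException(
                        "Unknown instruction on line " + line + ": " + text);
            }
        }
        return executed;
    }
}
```

With the three-line program DEC, BNZ 1, END and a starting counter of 3, the loop body runs three times before execution falls through to END.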
This chapter provides a brief description of Java; its history and the concepts that led to its creation.
Java goes back to 1991, when a group of software engineers working for Sun Microsystems wanted to design a language for use in consumer electronic devices [4], anything from videocassette recorders to cable TV switch boxes. Since these devices have little memory and use a wide range of different processors, a language was required which was portable and produced tight code. They achieved this by designing a language that compiled source code into byte code for a hypothetical machine, known as the Java Virtual Machine (JVM). This intermediate code could then be run on any machine that had an appropriate interpreter.
Java exploded onto the scene in 1995 with the production of a web browser, now known as HotJava. Since then, its popularity has increased, especially in the production of glue software known as Middleware and Internet applications.
Java is a fully object-orientated language, which bears many similarities to C++. However, Java is designed to be far more robust and simpler to use. Almost everything in Java is a class, including many of the data types such as strings, which are in fact simply classes wrapped around the character primitive. Every class in Java originates from the super class called Object. Java does not use pointers like more traditional languages such as C; instead, reference variables are used to refer to structures and objects. Methods, which are equivalent to functions or procedures in other languages, must be contained within a class.
In most languages the source code is compiled directly into a binary file, which can be executed only on the target platform. The binary file contains machine code instructions that are unique to the processor and function calls that are unique to the operating environment. Java differs significantly in this respect.
Java compiles source code into a form known as byte code for a virtual processor, which can be thought of as running its own virtual operating system. The byte code is read into memory and interpreted by the Java Runtime Environment. The JRE effectively translates the byte code into a series of instructions understood by the native processor and operating system. In effect, any operating system that has a JRE is able to execute the Java “binaries” and no recompilation is necessary.
Software written in Java can either be designed to run as an application or an applet. An application generally runs on a user’s system, much the same as any other program. An applet is usually received from a remote server and executed, often as part of a web page. Certain restrictions are placed on applets for security reasons. These restrictions are enforced by the JRE and are not a programming convention. For example, applets cannot open files contained on the user’s machine.
This chapter describes the analysis and design of the final version of the User Extensible System Architecture. The design of the system was closely linked with its implementation as described in Chapter 6.
This section describes the operation and layout of the classes, which form the CPU components of the UESA.
The specification for the project stated that the size or width of the registers should be user-definable as either 16- or 32bit. In light of this, a register definition was required which could cope with different data sizes depending upon the user's choice. The initial register design used an array of four bytes to represent one register. The set of registers would be created as an array in which each element was itself an array of four bytes. This effectively creates a two-dimensional array, with one dimension holding the number of registers and the other holding the size of the registers. The thinking behind this was that it would allow 8-, 16- and 32bit register operations to be performed. The diagrams showing the original designs for memory and registers are attached in Appendix E.
Prototype source code was written to demonstrate how this method operates, by manipulating the individual bytes of each register. However, it soon became clear that this design had a major flaw. For example, putting a 32bit value into a register would require splitting the value into four bytes, requiring a complex set of calculations. In turn, these calculations would have to be performed multiple times for each instruction. The prototype source code is attached in Appendix C.
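The kind of conversion that the byte-array design requires can be sketched as follows. This is not the prototype from Appendix C; the class name ByteRegister and the choice of storing the most significant byte first (as the 68000 does) are assumptions made for the illustration.

```java
// Converts between a 32-bit value and the four-byte representation used by
// the initial register design.
final class ByteRegister {
    private ByteRegister() {}

    // Splits a 32-bit value into four bytes, most significant byte first.
    static byte[] toBytes(int value) {
        return new byte[] {
            (byte) (value >>> 24),
            (byte) (value >>> 16),
            (byte) (value >>> 8),
            (byte) value
        };
    }

    // Reassembles the value. The & 0xFF masks are needed because Java bytes
    // are signed and would otherwise be sign-extended when widened to int.
    static int fromBytes(byte[] bytes) {
        return ((bytes[0] & 0xFF) << 24)
             | ((bytes[1] & 0xFF) << 16)
             | ((bytes[2] & 0xFF) << 8)
             |  (bytes[3] & 0xFF);
    }
}
```

Every instruction that reads or writes a full register would have to pay for both conversions, which is the overhead described above.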
At this stage I began to explore object-orientated programming in earnest and discovered it would be far more effective to create a register class. The UML representation of this class is shown in figure F.5 of Appendix F. The advantage of treating a register as an object, as opposed to a primitive data type, is that it can be accessed only by using the specified methods.
This presents a standard interface to the register class in which the underlying code of the methods can be changed without altering any other classes. It also means that the internal workings of the register class cannot be altered by malicious code. Another advantage is that the GUI component for displaying the contents of a register can be contained within the register class. This allows the GUI component to be automatically updated whenever the value of the register is changed.
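A minimal sketch of such a class is given below. The names Register, RegisterListener, getValue and setValue are hypothetical and do not reproduce the methods shown in figure F.5. Note also one deliberate difference: the UESA holds the GUI component inside the register class, whereas this sketch substitutes a plain listener interface so that the example does not depend on Swing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback that a display component could implement.
interface RegisterListener {
    void valueChanged(int newValue);
}

final class Register {
    // Hidden state: reachable only through the methods below.
    private int value;
    private final List<RegisterListener> listeners = new ArrayList<RegisterListener>();

    void addListener(RegisterListener listener) {
        listeners.add(listener);
    }

    int getValue() {
        return value;
    }

    // Every change passes through this method, so the display can never
    // fall out of step with the stored value.
    void setValue(int newValue) {
        value = newValue;
        for (RegisterListener listener : listeners) {
            listener.valueChanged(newValue);
        }
    }
}
```

Because callers see only the methods, the internal representation could later be changed (for example to support 16bit values) without altering any other class.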
Although address and data registers are formed from the same class, it is important to understand the difference between the two. The UESA treats data registers as a simple store, whereby their contents can be set or retrieved. Address registers are treated differently. Their contents are set in much the same way as that used for data registers, but they cannot be retrieved directly. Address registers actually hold a pointer to a memory location, so that any reference made to an address register, actually retrieves the contents of the memory location, pointed at by the register. This is analogous to address register indirect addressing as used on the 68000 [2]. Four public methods control the operation of the register.
The third advantage is that an array of register objects can be created to represent the entire number of registers chosen by the user.
The current register class only supports 32bit values due to time constraints, however an additional method could be inserted into this class and similar classes, to deal with 16bit values.
The initial design for representing the main store memory was much the same as that chosen for registers and based on the assumption that memory was byte aligned (see Appendix E).
Clearly, this would fall foul of the same problem concerning the conversion of data with differing bit lengths. In light of this, a memory class was constructed so that a single memory location could be treated as an object. The memory location class contains the following three public methods.
An array of memory location objects can be used to represent the entire main store memory available to the CPU. Methods to deal with GUI components are not required for memory locations, because it is physically impossible to display all memory locations within a single window.
In a real processor such as the 68000, the condition code register contains a series of bits whereby each bit represents one flag. As a side point, the CCR in the 68000 actually refers to a section of bits contained within the status register (SR). A similar approach was adopted for the CCR used by the UESA.
Rather than designing a class to represent a single flag, one class was designed to represent the entire condition code register. The methods of the CCR class are responsible for changing the state of the individual flags. The CCR contains four flags based upon those found in the 68000. These are Carry (C), Zero (Z), Negative (N) and Overflow (V). The CCR class contains 17 methods that are summarised as follows. The “X” in the method names denotes the name of the flag, i.e. C, Z, N or V.
As with the registers, the GUI component for each flag is contained within the class. Each time the state of a flag is altered the GUI component is automatically updated to reflect those changes.
While code exists to alter the state of the Overflow and Carry flags, the routines that would detect these conditions have not been implemented, because of uncertainty over how to check for the required states.
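For reference, one common technique for detecting these two conditions on a 32bit addition is sketched below. It is offered as a possible approach only and is not part of the UESA; the class name AddFlags and its fields are hypothetical. Carry is detected by checking whether the sum, viewed as an unsigned number, is smaller than one of the operands, and overflow by checking whether both operands have a sign that differs from the sign of the result.

```java
// Result of a 32-bit addition together with the four condition flags.
final class AddFlags {
    final int result;
    final boolean carry;
    final boolean overflow;
    final boolean zero;
    final boolean negative;

    private AddFlags(int result, boolean carry, boolean overflow) {
        this.result = result;
        this.carry = carry;
        this.overflow = overflow;
        this.zero = (result == 0);
        this.negative = (result < 0);
    }

    static AddFlags add(int a, int b) {
        int sum = a + b;  // Java int arithmetic wraps modulo 2^32
        // Carry: the unsigned sum wrapped around, so it is smaller than an operand.
        boolean carry = Integer.compareUnsigned(sum, a) < 0;
        // Overflow: both operands differ in sign from the result, which can
        // only happen when they share a sign that the result does not have.
        boolean overflow = ((a ^ sum) & (b ^ sum)) < 0;
        return new AddFlags(sum, carry, overflow);
    }
}
```

For example, adding 1 to 0x7FFFFFFF sets Overflow and Negative but not Carry, while adding 1 to 0xFFFFFFFF sets Carry and Zero but not Overflow.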
Instructions are instantiated by the instruction class, which contains three important pieces of information. Each instruction is formed from the mnemonic, its functions and its operands. Collectively, they make up an entire instruction definition. The mnemonic is simply held as a string, which is used for comparison against strings from the source code to identify an instruction.
The basic operation of any instruction can be broken down into a sequence of one or more basic functions (see section 5.3.3). For example, an instruction that adds two integers and leaves the result in a register can be represented by an “add” function followed by a “move” function. The instruction class holds an array of these “functions” which are processed in order when the instruction is encountered during the execution phase. The array itself is simply composed of constants. The value of the constant in an array element is a representation of the function to perform. During the execution phase, these constants are read from the instruction class and the relevant functions performed. See figure 5.1.

Figure 5.1 – Diagram to show the function array contained within the instruction class
The third point to consider is the operands associated with the instruction. These are contained within the instruction class in a similar manner to that used for the functions. An operand array also holds a series of constants that represent the data type of each operand (see figure 5.2). The operands are held in reverse order because a stack is used during the execution phase to hold the value of each operand; consequently, the first value pulled from the stack is the value of the instruction's last operand.

Figure 5.2 – Diagram to show the operand array contained within the instruction class
By reading the arrays shown above (figures 5.1 and 5.2) in the execution phase, we can define an instruction that performs addition on operands three and four, then performs a subtraction with operand two and the result of the addition. Finally, the result is moved into the data register. For information on defining new instructions, please see section 5.3.2.
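The interplay of the function array, the reversed operand array and the stack can be sketched as follows. All names and constant values here are hypothetical, and the sketch makes two assumptions that the text does not settle: that the subtraction computes operand two minus the result of the addition, and that a data register operand is written in the source as its register number.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class InstructionSketch {
    // Hypothetical constants; the values used by the UESA are not given.
    static final int ADD = 0, SUB = 1, MOVE = 2;
    static final int DATA_REGISTER = 0, LITERAL = 1;

    private final int[] functions;     // performed in array order
    private final int[] operandTypes;  // reverse order: entry 0 describes the last operand

    InstructionSketch(int[] functions, int[] operandTypes) {
        this.functions = functions.clone();
        this.operandTypes = operandTypes.clone();
    }

    // operands holds the raw integers from the source line, in written order.
    void execute(int[] operands, int[] dataRegisters) {
        Deque<Integer> stack = new ArrayDeque<Integer>();
        for (int raw : operands) {
            stack.push(raw);           // the last operand ends up on top
        }
        int nextType = 0;              // index of the type of the next popped value
        int result = 0;
        boolean haveResult = false;
        for (int function : functions) {
            if (function == MOVE) {
                if (!haveResult) {     // plain move: the source is still on the stack
                    result = resolve(stack.pop(), operandTypes[nextType++], dataRegisters);
                    haveResult = true;
                }
                if (operandTypes[nextType++] != DATA_REGISTER) {
                    throw new IllegalStateException("Destination must be a data register");
                }
                int destination = stack.pop();
                dataRegisters[destination] = result;
            } else if (function == ADD || function == SUB) {
                int right;
                if (haveResult) {      // chain onto the running result
                    right = result;
                } else {
                    right = resolve(stack.pop(), operandTypes[nextType++], dataRegisters);
                }
                int left = resolve(stack.pop(), operandTypes[nextType++], dataRegisters);
                result = (function == ADD) ? left + right : left - right;
                haveResult = true;
            } else {
                throw new IllegalArgumentException("Unknown function constant " + function);
            }
        }
    }

    // The stack holds plain integers; the type decides how each one is treated.
    private static int resolve(int raw, int type, int[] dataRegisters) {
        return (type == DATA_REGISTER) ? dataRegisters[raw] : raw;
    }
}
```

With the functions ADD, SUB, MOVE, operand types LITERAL, LITERAL, LITERAL, DATA_REGISTER and source operands 0, 10, 3, 4, the sketch stores 10 - (3 + 4) = 3 in data register 0.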
The instruction class contains six methods for instruction handling, as follows.
Source code layout was defined using augmented Backus Naur Form. The original BNF notation was devised by John Backus and Peter Naur and used to specify the language ALGOL 60 [5]. It was necessary to define how the source should be laid out because the parsing routine would need to process the source code based upon a set of rules. Figure 5.3 shows the augmented BNF definition. A description of the BNF notation used is attached in Appendix B.
Comment = ";" *OCTET
Mnemonic = 1*OCTET SP
Operand = *SP ("D" | "A" | "$" | "#") 1*DIGIT *SP
Label = 1*OCTET ":"
Operandset = *(Operand ",") Operand
Command = Mnemonic Operandset
Line = *SP (Comment | Label | Command) EOL
Figure 5.3 – Augmented BNF definition of assembly source code.
This BNF definition can be summarised in English as follows.
The UESA executes programs by reading an ASCII source code file and processing it. A parsing routine was designed, based upon the BNF specification for the source code. The parsing routine sequentially reads lines from the source code and breaks them up into the relevant tokens. The parsing routine could be considered as a filter, whereby the tokens are filtered out at different stages for processing. The flowcharts for the design of the parsing routine are attached at the end of this section and a description of the flow chart notation is attached in Appendix D.
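The filtering idea can be illustrated with a small classifier that follows the BNF rules directly. This is a simplified sketch and not the ParseSource class: the names LineClassifier, Kind, classify, isOperand and splitOperands are hypothetical, and a regular expression is used for the Operand rule purely for brevity.

```java
final class LineClassifier {
    enum Kind { COMMENT, LABEL, COMMAND }

    private LineClassifier() {}

    // Line = *SP (Comment | Label | Command) EOL
    static Kind classify(String line) {
        String text = line.trim();            // the grammar allows leading spaces
        if (text.isEmpty()) {
            throw new IllegalArgumentException("Empty line");
        }
        if (text.startsWith(";")) {
            return Kind.COMMENT;              // Comment = ";" *OCTET
        }
        if (text.length() > 1 && text.endsWith(":")) {
            return Kind.LABEL;                // Label = 1*OCTET ":"
        }
        return Kind.COMMAND;                  // Command = Mnemonic Operandset
    }

    // Operand = *SP ("D" | "A" | "$" | "#") 1*DIGIT *SP
    static boolean isOperand(String text) {
        return text.matches(" *[DA$#][0-9]+ *");
    }

    // Operandset = *(Operand ",") Operand
    // Returns the operands with surrounding spaces removed.
    static String[] splitOperands(String operandSet) {
        String[] parts = operandSet.split(",", -1);   // -1 keeps empty trailing parts
        for (int i = 0; i < parts.length; i++) {
            if (!isOperand(parts[i])) {
                throw new IllegalArgumentException("Bad operand: '" + parts[i] + "'");
            }
            parts[i] = parts[i].trim();
        }
        return parts;
    }
}
```

A line is first filtered by its kind; only a command goes on to have its operand set split and each operand checked.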
The parsing routine is also responsible for executing the instructions and setting the CPU’s initial state. The ParseSource class contains three public methods and 17 private methods. The private methods are only accessible by the ParseSource class for its own internal use. The main methods are summarised as follows.

Figure 5.4 – Flowchart to show overview of the parsing routine

Figure 5.5 – Flowchart to show overview of the parseComment() method.

Figure 5.6 – Flowchart to show overview of the labelError() method.

Figure 5.7 – Flowchart to show overview of the parseDefault() method.

Figure 5.8 – Flowchart to show overview of the parseLabel() method.

Figure 5.9 – Flowchart to show overview of the parseCommand() method.

Figure 5.10 – Flowchart to show overview of the parseOperandset() method

Figure 5.11 – Flowchart to show overview of the parseOperand() method.

Figure 5.12 – Flowchart to show overview of the parseSpace() method.
As with any processor, an instruction set is required in order to execute programs. The UESA does not have a built-in instruction set as such (see the user documentation in Appendix G), but allows the user to define their own. This section describes the methods and notation by which instructions are defined and represented.
In an ideal situation, the user would be able to define the instructions at the machine code level; however, this is considered impractical for several reasons. Machine code is difficult to understand, and programming in it directly is beyond all but the most adept programmers and hardware engineers. The UESA is more concerned with the functionality of those instructions, so it is preferable to allow the definition of assembly instructions.
A suitable notation is needed to describe an instruction, which both the user can understand and the software interpret. Register Transfer Language (RTL) is a common notation for describing the operation of an instruction and the manipulation of data [2]. In light of this, it was deemed a suitable notation to use for instruction definition.
On further analysis it became clear that RTL was not as appropriate as it first seemed. Whilst it is perfectly adequate for describing an instruction, in its current form it is not as useful for instruction definition. For example, an instruction in assembly language that moves a literal integer into a data register may have the form ‘MOVE D0, #20’. The first operand is the destination and the second is the source. This may be described in RTL using the sequence ‘[D0]<-20’. If we wanted to do the reverse and create an instruction based upon that RTL sequence, we would have an instruction which took no parameters and simply moved the integer 20 into D0. No parameters would be needed because there would be nothing to specify; the operation is inherent within the mnemonic. Clearly, this would allow the user to define some extremely obscure and unintelligible source code. A more generic definition was needed. The notation used by the UESA is shown in Table 5.1, and is based upon an adapted subset of RTL.
Table 5.1 – Adapted subset of RTL as used by the UESA
Construct | Description
[A] | An address register
[D] | A data register
[M] | The contents of a memory location
[#] | An integer literal
dest <- source | Move the source to the destination
- | Subtract
+ | Add
/ | Divide
* | Multiply
The user defines an instruction by entering the mnemonic as a string, followed by an RTL sequence. This sequence is then parsed by the UESA and an instruction is defined. The method which handles the RTL sequence is contained within the instruction definition window class and is called parseRTL(). Each construct in a sequence of RTL can be considered as either a functional construct i.e. it describes an action, or an operand construct, for example a memory location or value. An example of a function construct might be ‘+’ which indicates an ‘add’ function whereas ‘[D]’ indicates an operand, in this case a data register. The instruction class requires four groups of information to be able to create a new instruction. These are summarised below.
Whilst parsing an RTL sequence, each time an operand or function is encountered the corresponding variable is incremented and an entry made in the corresponding array. The order in which the entries are placed in the arrays is important because it determines the order in which they are executed.
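The separation of an RTL sequence into operand constructs and function constructs can be sketched as follows. This is not the parseRTL() method itself: the class name RtlSketch is hypothetical, and the entries are simply recorded in the order encountered, which may differ from the order in which the UESA stores them for execution.

```java
import java.util.ArrayList;
import java.util.List;

final class RtlSketch {
    // Operand constructs in the order encountered: "A", "D", "M" or "#".
    final List<String> operands = new ArrayList<String>();
    // Function constructs in the order encountered: "<-", "+", "-", "*" or "/".
    final List<String> functions = new ArrayList<String>();

    private RtlSketch() {}

    static RtlSketch parse(String rtl) {
        RtlSketch out = new RtlSketch();
        int i = 0;
        while (i < rtl.length()) {
            char c = rtl.charAt(i);
            if (c == ' ') {
                i++;                                   // spaces carry no meaning
            } else if (c == '[') {
                // An operand construct is exactly three characters: '[' kind ']'.
                if (i + 2 < rtl.length() && rtl.charAt(i + 2) == ']'
                        && "ADM#".indexOf(rtl.charAt(i + 1)) >= 0) {
                    out.operands.add(String.valueOf(rtl.charAt(i + 1)));
                    i += 3;
                } else {
                    throw new IllegalArgumentException("Bad operand at position " + i);
                }
            } else if (rtl.startsWith("<-", i)) {
                out.functions.add("<-");               // must be tested before '-'
                i += 2;
            } else if ("+-*/".indexOf(c) >= 0) {
                out.functions.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected '" + c + "' at position " + i);
            }
        }
        return out;
    }
}
```

The sizes of the two lists correspond to the operand and function counts mentioned above. Parsing ‘[D]<-[D]+[#]’ yields the operands D, D, # and the functions <-, +.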
Certain types of instructions cannot be defined by specifying an RTL sequence, for example branching instructions. For cases such as these, the UESA provides a set of internal instructions that may be used to perform branching. Details of these instructions are described in the user documentation attached in Appendix G.
The user is able to define instructions which make use of the four basic arithmetic operators, in conjunction with several addressing modes. The UESA supports three types of addressing modes. These are register indirect addressing, as mentioned in section 5.1.1, literal addressing, and memory direct addressing. Literal addressing simply refers to an absolute value such as the number five and memory direct addressing refers to the contents of a memory location.
When defining the RTL sequence for an instruction, there are certain pitfalls which need to be observed. A definition such as ‘[D]<-[D]<-[#]’ creates an instruction that moves a literal into two data registers. In actual fact it means move an integer into a data register, then move the contents of that register into another data register. However the outcome is the same. Nevertheless, you might be forgiven for thinking that the sequence ‘[A]<-[A]<-[#]’ creates an instruction that moves a literal into two address registers. This is not the case. As mentioned in section 5.1.1 address registers hold pointers. So in this case, the literal would be moved into the first address register, but the second address register would have the contents of the memory location pointed at by the first, inserted instead.
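The address register case can be traced with a small model. This is a sketch under stated assumptions rather than the UESA's own code: the arrays, the class AddressPitfallSketch and its method names are hypothetical, and the model only encodes the rule given above, namely that an address register used as a source yields the memory location it points at.

```java
/** Hypothetical model of the '[A]<-[A]<-[#]' pitfall; not the UESA's own classes. */
class AddressPitfallSketch {
    static int[] memory = new int[10]; // main store
    static int[] areg = new int[8];    // address registers (hold pointers)

    /** As a source, an address register yields the memory location it points at. */
    static int readViaAddressRegister(int reg) {
        return memory[areg[reg]];
    }

    /** Simulates '[A]<-[A]<-[#]' with operands (outer, inner, literal). */
    static void moveLiteralThroughAddressRegisters(int outer, int inner, int literal) {
        areg[inner] = literal;                       // inner move: pointer := literal
        areg[outer] = readViaAddressRegister(inner); // outer move: indirect read
    }

    public static void main(String[] args) {
        memory[3] = 42;
        moveLiteralThroughAddressRegisters(0, 1, 3);
        // A1 holds the literal 3, but A0 holds the contents of memory location 3.
        System.out.println("A1=" + areg[1] + " A0=" + areg[0]);
    }
}
```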
During the parsing phase, all operand values associated with an instruction are placed on the stack in preparation for execution. The values on the stack are all integers, it is the responsibility of the execution phase to determine how to treat them.
As mentioned in section 5.1.3, an instruction definition holds two arrays: one contains a list of functions to perform and the other contains a list of the operands’ data types. It is important at this stage for the reader to understand the difference between the operand values on the stack and the operand data types held in the array. The stack holds the true values of the operands, which have been obtained from the source code. The operand array, however, simply describes the data types of those values.
The execution phase determines the first function to perform by looking at the function array. The next step is to pull two values from the stack for use by the function, however these cannot be used directly. An integer on the stack could be a literal, or could refer to a particular data register. A method called resolveValue() is used, which looks at the operand array and determines the data type for that integer. Where the integer is found to be a literal, resolveValue() simply returns it for use by the function. Where the integer refers to a data register, it is the contents of that data register which are returned. A similar approach is used for all the possible data types. Any result produced by a function is placed back on the stack and may be used by the next function. A boolean variable is used to indicate that a pending result is on the stack for the next function to use.
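The resolution step might look something like the following sketch. Only the method name resolveValue() comes from the UESA; its signature, the class ResolveSketch, the array fields and the single-character type codes are assumptions made for illustration.

```java
/** Hypothetical sketch of operand resolution; field names and type codes are assumptions. */
class ResolveSketch {
    final int[] dreg = new int[8];    // data registers
    final int[] areg = new int[8];    // address registers (hold pointers)
    final int[] memory = new int[10]; // main store

    /** Turns a raw integer from the stack into the value a function should use. */
    int resolveValue(int raw, char type) {
        switch (type) {
            case '#': return raw;               // literal: use the integer as it stands
            case 'D': return dreg[raw];         // data register: return its contents
            case 'M': return memory[raw];       // memory direct: contents of the location
            case 'A': return memory[areg[raw]]; // register indirect: follow the pointer
            default:
                throw new IllegalArgumentException("Unknown operand type " + type);
        }
    }

    public static void main(String[] args) {
        ResolveSketch cpu = new ResolveSketch();
        cpu.dreg[2] = 7;
        cpu.memory[4] = 99;
        cpu.areg[1] = 4;
        System.out.println(cpu.resolveValue(2, '#')); // the literal itself
        System.out.println(cpu.resolveValue(2, 'D')); // contents of D2
        System.out.println(cpu.resolveValue(4, 'M')); // contents of memory location 4
        System.out.println(cpu.resolveValue(1, 'A')); // location pointed at by A1
    }
}
```

The same integer therefore produces different values depending solely on the data type recorded for it in the operand array.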
The functions that are used by the execution phase are summarised below. All instructions used by the UESA are formed from one or more of these functions.
The user interacts with the system by using separate frames or windows. The windows in their current state are shown in this section, whilst the original designs upon which they were based are in Appendix H.
The main window (see figure 5.4) is the first interface to appear when the UESA is executed. The main window contains information about the current user’s settings. A menu system is available which implements event handling and has the following options.

Figure 5.4 – The main window GUI.
Figure 5.5 – The file requestor used for choosing the source code.

Figure 5.6 – The “Run Source” window.

Figure 5.7 – The “Memory Size Definition” window.

Figure 5.8 – The “Register Definition” window.

Figure 5.9 – The “Instruction Definition” window.

Figure 5.10 – The “About” window.
“Open Source File Requestor” (figure 5.5)
Displays a standard file requestor, which allows the user to select a file for processing by the UESA.
“Run Source Window” (figure 5.6)
All information regarding the execution of source code is displayed in the centre of the window, including the current line of source code being executed and any errors generated. The text fields down the sides of the window show the contents of the data and address registers respectively while the text fields in the lower half show the current status of the processor flags. The buttons on this window have the following functions.
“Define Memory Size” (figure 5.7)
Shows the current number of memory locations available to the CPU and provides a text field in which the user can enter a new value. Memory is shown in blocks because each location is currently 32bits wide.
“Define Registers” (figure 5.8)
Shows the current number of data and address registers. Text fields are provided for the user to change these values. The radio buttons were intended to let the user change the size of the registers and in turn the size of a memory block, however this has not been implemented and so they are redundant. The “OK” button accepts any changes, whilst the close button discards them.
“Define Instructions” (figure 5.9)
This window handles the definition of new instructions. The user enters the mnemonic in the first text field and an RTL sequence in the second. The instruction is submitted by pressing the “Add Instruction” button.
“About” (figure 5.10)
Displays a brief description of the program, its current version and the author’s name.
In general, the classes that were used to form the CPU components were well designed with a specific purpose in mind. However, the design of the GUI classes did not use such a clean approach. There were several reasons for this. Firstly, the code for the interface was written manually and required a certain degree of experimentation, especially with regard to component layout. Secondly, the Swing classes used for interface creation are complex yet powerful, and so certain methods were changed in favour of others.
The class used for the parsing and execution of source code was by far the most difficult to write and is perhaps far more unwieldy than it needed to be. This has led to a certain amount of code duplication.
Implementation and testing were closely linked with the analysis and design of the software as described in Chapter 5. This chapter describes the implementation process used.
Implementation was done by integrating the prototype source into the main code on an incremental basis. The project was broken into smaller tasks that could generally be completed within a few days. Each task usually consisted of a single class that performed a specific function. The order in which the tasks were attempted was related to their complexity and grouping. The simpler programming tasks were generally attempted earlier, whilst the more sophisticated tasks were left till later in the project. The grouping relates to whether a specific task was considered a CPU component, a GUI component or some glue code that acts as an interface between the different classes. The CPU component programming tasks were considered the core of the project and were attempted first. The GUI tasks were constructed later in order to wrap around the component code effectively. It was felt that had the reverse approach been adopted, significant alterations to the GUI code would have had to be made in order to accommodate the core of the program. Immediately after integration of the prototype source into the main code, the program was compiled and tested to see how it performed.
Certain risks can be identified with using this approach. Firstly, any alterations that modified existing and previously tested code could introduce errors into the program. However, this risk was kept to a minimum because new code was usually added in the form of a class and required little, if any, modification to the existing code. Any errors which were identified could be tracked down to the new class and amended. Regular backups of the entire project were stored on a fileserver, which ensured that a previous working copy could be retrieved if necessary.
Almost all the requirements laid down in the original specification were implemented. The exceptions were the ability to alter the size of registers to handle values of differing bit lengths and the C and V flags, which whilst partly implemented are redundant in use.
Testing took place whenever significant changes were made to the project. Most of the testing was done using the development platform, which was JDK 1.2.2 running under Windows 98. However limited testing was done using JDK 1.2.1 running under Linux to check for consistency across different operating systems.
GUI event handling was tested immediately after implementation by activating all menus and buttons under a variety of circumstances. Additionally, all keyboard shortcuts relating to the menus were tested.
The software’s operation was tested by using the four programs which are shown in Appendix I. The first program, datamovement.s, tests the flow of data between the components of the CPU, to ensure that both the components and the move function work as expected. The program operates by simply moving literals into registers and memory locations as well as moving values between these components. The program also demonstrates address register indirect addressing, absolute addressing and literal addressing.
The second program, arithmetic.s, tests the four main arithmetic functions used by the UESA. These are addition, subtraction, multiplication and division. The result of each instruction is placed in a consecutive data register, so that the results can be verified.
The third program, flag.s, was used to test two of the processor flags, Z (Zero) and N (Negative). Arithmetic instructions are used to produce a result which alters the state of these flags. A move instruction will also alter the state of these flags, but this is obviously dependent on the value being moved. This program also tests the three main comparison functions, which are CMP (Compare), SGT (Set if Greater Than) and SLT (Set if Less Than). These functions alter the state of the Z flag to indicate the truth of the instruction. For example, testing two values for equality will set the Z flag to ‘ON’ if they are indeed equal, otherwise it will be cleared.
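The flag behaviour exercised by flag.s can be summarised in a short sketch. The class FlagSketch and its method names are hypothetical. The rule that Z follows a zero result and N a negative one is the conventional meaning of these flags and is assumed here, not taken from the UESA source.

```java
/** Hypothetical flag model; method names are illustrative, not the UESA's. */
class FlagSketch {
    boolean z; // Zero flag
    boolean n; // Negative flag

    /** Assumed rule: Z is set for a zero result, N for a negative one. */
    void updateFromResult(int result) {
        z = (result == 0);
        n = (result < 0);
    }

    /** CMP: Z is set if the two values are equal, otherwise it is cleared. */
    void cmp(int a, int b) { z = (a == b); }

    /** SGT: Z is set if the first value is greater than the second. */
    void sgt(int a, int b) { z = (a > b); }

    /** SLT: Z is set if the first value is less than the second. */
    void slt(int a, int b) { z = (a < b); }

    public static void main(String[] args) {
        FlagSketch flags = new FlagSketch();
        flags.updateFromResult(3 - 5);
        System.out.println("After 3-5:   Z=" + flags.z + " N=" + flags.n);
        flags.cmp(4, 4);
        System.out.println("After CMP 4,4: Z=" + flags.z);
        flags.slt(9, 2);
        System.out.println("After SLT 9,2: Z=" + flags.z);
    }
}
```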
The fourth program, factorial.s, tests several areas of the instruction set. Firstly, it proves that the UESA can be used to generate useful results, by calculating the factorial of five. Secondly, the factorial is calculated in a functional manner, which means that it exploits branching as part of the calculation. Finally, the movement and arithmetic functions are tested, because they are an integral part of the calculation.
These four programs are supplied with the system so that the user can verify its operation. All four programs make use of the pre-defined instruction set which has been implemented purely for the user’s convenience.
Several user programs have been tested on the system throughout its development, however it is impossible to test every combination of instructions. Nevertheless, programs were written which tested the robustness of error reporting with malformed programs. In general this is quite successful, with most problems generating an error response. For example, warnings are given for references to out of range registers and memory locations, or malformed labels. An error is not issued for every possible problem, however the worst case scenario is that the program being executed simply terminates prematurely. This early termination occurs when the cause of the problem is not immediately obvious to the system.
The testing phase did in fact pick up one fairly serious bug. The bug was related to the move function, which can be used as part of an instruction definition. The move function worked correctly, provided it was used as the last function in an instruction definition. For example, an instruction which uses the move function as its last function may be defined as:
[D]<-[#]+[#]
Add two literals and move the result into a data register.
However, problems were encountered if the move function was used in the middle of an RTL sequence, as shown below.
[D]<-[D]<-[D]<-[#]
Moves a literal into three data registers.
The effect was that the literal would only be moved into one of the data registers, instead of all three. The cause of the problem was that the move function did not push any result back onto the stack. Consequently, any subsequent functions would be lacking one of their values and would therefore not work correctly. This has now been fixed.
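The fault can be illustrated with a simplified model of the operand stack. The sketch below is not the UESA source. It assumes that every operand of a chain such as ‘[D]<-[D]<-[D]<-[#]’ is pushed in source order, that the sequence is evaluated from right to left, and that each move pops a value and a destination register number. The class MoveChainSketch and the pushResultBack switch are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical model of a chain of moves evaluated from an operand stack. */
class MoveChainSketch {
    static int[] dreg = new int[4]; // data registers D0..D3

    /**
     * Executes a chain of moves. The operands are the destination register
     * numbers followed by the literal, in source order.
     */
    static void runMoveChain(int[] operands, boolean pushResultBack) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int op : operands) {
            stack.push(op); // the last operand ends up on top of the stack
        }
        int moves = operands.length - 1;
        for (int i = 0; i < moves; i++) {
            if (stack.size() < 2) {
                return; // a function is lacking one of its values
            }
            int value = stack.pop();
            int dest = stack.pop();
            dreg[dest] = value;
            if (pushResultBack) {
                stack.push(value); // the fix: leave the result for the next function
            }
        }
    }

    public static void main(String[] args) {
        int[] operands = {0, 1, 2, 7}; // D0, D1, D2 and the literal 7
        runMoveChain(operands, false);
        System.out.println("Without push-back: " + Arrays.toString(dreg));
        Arrays.fill(dreg, 0);
        runMoveChain(operands, true);
        System.out.println("With push-back:    " + Arrays.toString(dreg));
    }
}
```

In this model, omitting the push leaves only D2 holding the literal, while the later moves either consume leftover operands or find the stack empty.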
This chapter evaluates the software product and the project as a whole.
The user guide for the UESA is attached in Appendix G, which is actually a printout of the supplied HTML documentation.
The user interacts with the system by using a GUI that exploits the interface components with which most people are familiar. The user is guided through the system by using a series of menus and buttons, which can be activated by the keyboard and mouse, whichever is their preferred choice. Documentation is provided in a standard cross-platform file format (HTML), which describes how to install and use the system.
The simulator successfully executes the supplied example programs shown in Appendix I. During program execution, the user is provided with a view of the CPU which changes dynamically during program execution. A program can be executed in its entirety, to view the final results, or stepped through to monitor the execution of each instruction. The user can choose to display a memory dump, even during program execution, showing the contents of all memory locations.
The user can define new instructions within the program, however there is no facility to save the instruction set. All changes are lost, once the program quits. Instructions are defined using RTL and a list is presented showing all currently available instructions. The user is warned about any attempt to duplicate an instruction mnemonic.
The system has been thoroughly tested and in general seems to be robust. At the time of writing, I do not know of any serious problems with the system. There is a slight problem with the graphic on the main window: it is not always displayed immediately when the window first opens. However, this was considered peripheral to the project and was therefore not fixed.
More time could have been devoted to the design of the system, with the original design of classes being specified in UML before their implementation.
The flowcharts used to describe the processes involved in the parsing routine are perhaps too far removed from the actual source code, which has led to a certain amount of code duplication. However, it is believed that using a more concrete notation would have had a detrimental effect on the project, due to time constraints. This in turn would have led to less of the project reaching the implementation stage.
I believe I have gained a lot of experience in programming and computer architecture, by working on this project. Before the project I had virtually no experience with an object-orientated programming language, and now feel confident with both the methodology and terminology. In addition, I have learned an entirely new programming language as a by-product of the work.
A CPU simulator with an extensible architecture has been successfully demonstrated and proven to work. The system allows the user to define a complete instruction set as well as several components of the CPU. The system has a functional user interface which can be executed on a variety of platforms.
The system successfully executes source code contained within a text file and displays the results to the user. Source code can be executed on a step-by-step basis with the components of the CPU visible so that the user may monitor its effect. A facility is provided to display a memory dump for the user to view memory contents. The system is supported by an HTML help system, which can be viewed on any platform and is attached in Appendix G.
The system has been implemented as a Java application which has been proven to work under two operating systems and two versions of the JDK. The system is therefore considered to be portable as well as satisfying most of the project objectives.
Java has been found to be a powerful and robust language. However, it does have problems, which are inherent because of its portability. For example, using files can be quite a convoluted process because it has to deal with file system conventions for any computer, as well as a whole host of character sets. Certain constructs are also far too limiting, for example the switch statement, which leads to excessive and sometimes repetitive coding.
The current implementation has no method for the storage and retrieval of architectures defined by the user. Any architecture defined, including the instruction set, is lost once the program exits. Clearly, a way of saving and loading would need to be implemented for any future version.
Whilst RTL, or at least a notation based on RTL, has been proven to work effectively for instruction definition to a certain degree, it has several awkward limitations. Firstly, any instruction that is defined using RTL has a restricted operand set. That is, all the operands must be of the data types expressed in the RTL sequence. If the sequence specifies a literal, then an address cannot be used instead. A far cleaner method would be to define only the basic operation of the instruction, with the operand values and their data types being recognised at run-time. Secondly, certain types of instruction cannot easily be expressed within the scope of RTL, for example an instruction which tests a pair of values for equality and then sets a processor flag to indicate the result. For any similar project, it would be far preferable to create a new notation for instruction definition, which would address these concerns.
The interface used to define instructions lacks two features that would be convenient for instruction definition. Once a new instruction has been created, there is no way to remove it from the instruction set. You are effectively stuck with it, until the system exits. There is also no method by which an instruction’s definition can be edited which means that mistakes cannot be rectified.
The current method of performing a branch, by specifying an absolute line number, is cumbersome and awkward to use. In any program of significant size, it would become entirely impractical. Changing a single line of code would have the potential to completely destroy the flow of the program. The original intention was to use labels, however this could not be completed in the time available. Even so, a better solution would be to store the program in memory and implement a real program counter.
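Label-based branching of the kind originally intended could be approached with a first pass that records the line number of each label. The sketch below is one possible approach and is not part of the current UESA. It assumes that a label occupies a line of its own and ends with a colon; the class LabelSketch, the method collectLabels() and the sample listing are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical first pass that maps label names to source line numbers. */
class LabelSketch {
    static Map<String, Integer> collectLabels(String[] lines) {
        Map<String, Integer> labels = new HashMap<>();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            // Assumed layout: a label is the text before a trailing colon.
            if (line.length() > 1 && line.endsWith(":")) {
                labels.put(line.substring(0, line.length() - 1), i + 1); // 1-based
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        String[] program = {
            "; hypothetical listing",
            "loop:",
            "JMP 2"
        };
        // A branch to 'loop' could be translated into this line number.
        System.out.println("loop is at line " + collectLabels(program).get("loop"));
    }
}
```

With such a table, inserting or deleting a line would change the recorded line numbers automatically, so branch targets written as labels would remain valid.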
A fully-fledged user extensible architecture would have to take into consideration far more factors than have been considered here. The processor clock frequency could be user-defined, cache memory could be configurable, and it might even be possible to design multiple processors working in tandem.
[1] Bishop, Judy. Java Gently: Programming Principles Explained, Addison-Wesley
[2] Clements, Alan. The Homepage of Alan Clements, http://www.clements.flyer.co.uk/
[3] Horstmann, Cay S. & Cornell, Gary. Core Java 1.2: Volume 1 – Fundamentals, Prentice Hall, 1999
[4] Horstmann, Cay S. & Cornell, Gary. Core Java 1.2: Volume 1 – Fundamentals, Prentice Hall, 1999, Pages 2, 14-15
[5] BNF Description, Manchester University, http://www.cs.man.ac.uk/~pjj/bnf/ebnf_bnfwebclub.html
[6] Embedded Electronics Magazine, http://www.chipanalyst.com/tech_lib/embedded/esamples_c7.html
[7] M68000 8-/16-/32bit Microprocessor User’s Manual, Motorola Corporation, Section 2.1.1
[8] M68000 8-/16-/32bit Microprocessor User’s Manual, Motorola Corporation, Section 2.1.3
[9] PowerPC 604 RISC Microprocessor User’s Manual, Motorola Corporation & IBM Microelectronics, 1994, Page 21.
Appendix A – Project Specification
User Extensible System Architecture
BSc Computer Science Project Specification
The “User Extensible System Architecture” (UESA), is a processor simulator, however it differs in several respects to most other processor simulators.
The UESA will allow the user to define the standard register set contained within the processor. This UESA will contain up to two categories of standard user accessible registers. Data Registers and Address Registers. The user can define the number of Data Registers and the number of Address Registers contained within the UESA. By specifying zero Address Registers, the Data Registers will be treated as General Purpose Registers. The bit size of each register set will be configurable between 16 and 32bits. The user will be able to assign a register as a stack pointer. There will be an upper limit for both sets of registers for practical reasons. The processor will also contain a standard set of flags, similar to those found in the 68000 processor.
The amount of memory available to the processor will be configurable by the user.
The UESA will allow the user to define a simple instruction set using Register Transfer Language (RTL). The types of commands the user will be able to specify are Branches, Data Movement and some Addressing Modes.
The UESA will have a GUI to allow the user to configure their processor with ease, and will allow the user to see various states of the processor, such as the registers.
Assembly Code entered by the user will be interpreted by the UESA directly, and not compiled into Machine Code.
The chosen language for the UESA is Java, which will give the UESA maximum platform independency as well as allowing me to gain experience in Java coding and object orientated programming.

Appendix B - Augmented Backus Naur Form
Augmented Backus-Naur Form
This appendix describes the augmented BNF used for the specification of the assembly source code. The particular notation used, is similar to that described in RFC (Request For Comments) document 822, available from http://ftp.wustl.edu and many other university sites. The notation constructs are in bold italics, followed by a description.
name = definition
The name of a rule is simply the name itself without any enclosing angular brackets and is separated from its definition by the equals character.
“literal”
Quotation marks surround literal text. Unless stated otherwise, the text is case-insensitive.
rule1 | rule2
Elements separated by a bar are alternatives, for example ‘yes | no’ will accept yes or no.
(rule1 rule2)
Elements enclosed in parentheses are treated as a single element, thus ‘(A (B | C) D)’ allows the sequences ‘A B D’ and ‘A C D’.
*rule
The asterisk character preceding an element indicates repetition. The full form is ‘<n>*<m>element’ indicating at least <n> and at most <m> occurrences of the element. The default values are zero and infinity so that ‘*(element)’ allows any number of occurrences including zero.
OCTET
Any 8-bit sequence of data except CR, LF or SP
CR
US-ASCII CR, Carriage Return (13)
LF
US-ASCII LF, Line-Feed (10)
SP
US-ASCII SP, Space (32)
DIGIT
Any US-ASCII digit between “0” and “9”.
Appendix C - Sample Prototype Source Code
// Prototype source code for representing registers
// Author: Ian Chapman
// Tested: Using JDK 1.2.2 under Windows 98
//
// Additional Notes:
// long: 8 Bytes
// int: 4 Bytes
// short: 2 Bytes
// byte: 1 Byte
import corejava.*; // supplies the Console class used for keyboard input

public class RegisterProtoType
{
    public static void main(String[] args)
    {
        //Declarations
        int x=0, y=0, option=0, regnum=0, bytenum=0, val=0;
        //32 registers of 4 bytes each, indexed as dreg[byte][register]
        int[][] dreg;
        dreg = new int[4][32];

        //Display Menu
        System.out.print("\n\nUser Extensible System Architecture (UESA)\n");
        System.out.print("------------------------------------------\n");
        System.out.print("TEST PROGRAM: 1\n");
        System.out.print("Model Data Registers Using An Array\n\n");

        while (option != 4)
        {
            System.out.print("1. Add Value To Register\n");
            System.out.print("2. Show Register\n");
            System.out.print("3. List Registers\n");
            System.out.print("4. Quit\n");
            option=Console.readInt("Enter:");

            switch(option)
            {
                //Read input from the keyboard and add to register
                case 1:
                    regnum=Console.readInt("Choose Reg Num (0-31):");
                    bytenum=Console.readInt("Choose Byte Num (0-3):");
                    val=Console.readInt("Enter Value:");
                    //Reject indices that fall outside the array
                    if (regnum < 0 || regnum > 31 || bytenum < 0 || bytenum > 3)
                    {
                        System.out.print("Out Of Range\n");
                    }
                    else
                    {
                        dreg[bytenum][regnum]=val;
                    }
                    break;

                //Read input from keyboard and display register contents
                case 2:
                    regnum=Console.readInt("Enter Register Number (0-31):");
                    if (regnum < 0 || regnum > 31)
                    {
                        System.out.print("Out Of Range\n");
                        break;
                    }
                    System.out.print("\n\nRegister D");
                    System.out.print(regnum);
                    System.out.print(": ");
                    for (y = 0; y < 4; y++)
                    {
                        System.out.print(dreg[y][regnum]);
                        System.out.print(" ");
                    }
                    System.out.print("\n\n");
                    break;

                //List all registers and their contents.
                case 3:
                    System.out.print("Listing Registers\n-----------------\n");
                    for (x = 0; x < 32; x++)
                    {
                        System.out.print("Register D");
                        System.out.print(x);
                        System.out.print(": ");
                        for (y = 0; y < 4; y++)
                        {
                            System.out.print(dreg[y][x]);
                            System.out.print(" ");
                        }
                        System.out.println("");
                    }
                    break;

                //Quit
                case 4:
                    break;

                default:
                    System.out.print("No Such Option\n");
                    break;
            }
        }
    }
}
// Prototype source code for parsing an ASCII text file
// Author: Ian Chapman
// Tested: Using JDK 1.2.2 under Windows 98
import java.io.*;

public class ParseSourceProtoType
{
    public static void main (String[] args)
    {
        String str;
        char ch;
        int len, x=0, inststate=0, line=0;

        printf("\n\nUser Extensible System Architecture (UESA)\n");
        printf("--------------------------------------------\n");
        printf("TEST PROGRAM: 3\n");
        printf("Parses some source code.\n");
        printf(" Currently: Comments, Labels, Instructions (without operands)\n");
        printf("Reading file test.s\n\n");

        try
        {
            BufferedReader fin = new BufferedReader (new FileReader("test.s"));
            //returnline() yields null once the end of the file is reached
            while ((str=returnline(fin)) != null)
            {
                inststate=0;
                line++;
                len=str.length();
                x=0;
                while (x < len)
                {
                    ch=returnchar(str,x);
                    switch (ch)
                    {
                        //Comment Detection
                        case ';':
                            printf(str);
                            printf("\n->Comment Detected at Line ");
                            System.out.println(line);
                            x=len;
                            break;

                        //Whitespace detection of space characters
                        case ' ':
                            x++;
                            break;

                        //Detection of broken label
                        case ':':
                            printf(str);
                            printf("\n->Error: Label with no name at line ");
                            System.out.println(line);
                            x=len;
                            break;

                        default:
                            System.out.print(ch);
                            // Determine command or label, stopping at the end of the line
                            while (inststate==0 && x < len-1)
                            {
                                x++;
                                ch=returnchar(str,x);
                                System.out.print(ch);
                                inststate=insthandle(ch);
                            }
                            if (inststate==2)
                            {
                                printf("\n->Label Detected\n");
                            }
                            else
                            {
                                //Terminated by a space or by the end of the line
                                printf("\n->Command Detected\n");
                            }
                            x=len;
                            break;
                    }
                }
            }
            fin.close();
            printf("\nEnd Of File Reached\n");
        }
        catch(Exception e)
        {
            printf("\nUnable to read test.s\n");
        }
    }

    /*
    ** Method to print a string to the console
    */
    public static void printf(String mystr)
    {
        System.out.print(mystr);
    }

    /*
    ** Reads in a line of text from the specified
    ** BufferedReader and returns it. Returns null at
    ** the end of the file or if the read fails.
    */
    public static String returnline(BufferedReader fin)
    {
        String str=null;
        try
        {
            str=fin.readLine();
        }
        catch(IOException e)
        {
            printf("Error Reading File\n");
        }
        return str;
    }

    /*
    ** Returns the character at position X in a string.
    ** Checks if the position is out of bounds and
    ** prints an error msg if that is the case.
    */
    public static char returnchar(String mystr, int x)
    {
        char ch='!';
        if (x < 0 || x >= mystr.length())
        {
            printf("String Out of Range\n");
        }
        else
        {
            ch=mystr.charAt(x);
        }
        return ch;
    }

    /*
    ** Checks whether the next character is a space,
    ** colon or other character. Labels are
    ** terminated with :, commands terminated with space
    */
    public static int insthandle(char ch)
    {
        int val=0;
        switch (ch)
        {
            case ' ': //Must Be a Command
                val=1;
                break;
            case ':': //Must Be a Label
                val=2;
                break;
            default: //Read Another Character
                val=0;
                break;
        }
        return val;
    }
}
Appendix D – Flowchart Notation
Notes on flowchart diagrams.
The flow charts present an abstract overview of the parsing routine, including its flow control, decision making and process events. The components used in the diagrams, and their meanings are described below.

Figure D.1 – Description of the flowchart notation used to describe the parsing routine.
Appendix E – Initial Memory and Register Designs


Figure E.2 – Diagram to show the initial design for representing memory, which was later discarded.
Appendix F – UML Class Diagrams

Figure F.7 – (a) The parsing class responsible for parsing and execution. (b) A glue class, which holds the user’s preferences and settings.
Appendix G – HTML User Documentation
---HTML Documentation – PAGE 1 ---

Summary
The User Extensible System Architecture (UESA) is a processor simulator written in Java. The UESA has a user definable architecture that allows different components of the CPU to be configured, including the instruction set.
Installation and Requirements
Getting Started
Architecture Configuration
Built-In Instruction Set
Defining Instructions
Source Code Layout
Executing Source Code
Installation and Requirements
This system is a Java application and was created using JDK 1.2.2 under Windows 98. To guarantee compatibility, please make sure you have JDK 1.2.2 fully installed and working.
The distribution should contain the following files.
---HTML Documentation – PAGE 3 ---
Getting Started
To execute the UESA, run the file UESA.jar by double clicking on it.
The UESA starts by presenting you with the main window, containing a list of the default settings. For the most part, navigation is achieved by browsing through the menus, or by activating the keyboard shortcuts, as detailed in the menus.
To test that your installation has been successful:
---HTML Documentation – PAGE 4 ---
Architecture Configuration
Before using the UESA, it is important to define the architecture under which the source code will run. For the most part, the default settings will suffice, however this section describes how to change the settings from the default.
Main store memory is the amount of memory available to the programs which run under the UESA. To set your memory requirements, open the "Memory Definition" window by selecting "Define Memory Size" from the "Configuration" menu, or press CTRL+M
Memory is defined as blocks, or in other words a number of locations. The default value is 10. To change this value, simply enter a new integer in the text field. Legal values are 1 to 4096. Press OK to accept the changes, or Cancel to discard them.
Registers are areas of fast memory located internally within the processor. The UESA supports two types of registers. Data Registers simply hold values which can be retrieved, whilst address registers hold a pointer to a memory location.
To alter the number of registers, open the "Register Definition" window by selecting "Define Registers" from the "Configuration" menu. The default value for both types of register is 8. To change the default, enter a new integer in the text fields. Legal values for data registers are 1 to 32, whilst the legal values for address registers are 0 to 32.
Please see the section on defining instructions.
---HTML Documentation – PAGE 5 ---
Built-In Instruction Set
The UESA does come with a built-in instruction set, however this is purely for the convenience of running the supplied example programs, and for providing branch instructions, which cannot be defined in RTL.
The following table lists the predefined instructions, with their RTL sequence and a description of its function.
Instruction | RTL Sequence | Description
MOVE.DA {dreg},{areg} | [D]<-[A] | Moves the contents of a memory location, pointed at by an address register, into a data register.
MOVE.AD {areg},{dreg} | [A]<-[D] | Moves the contents of a data register into an address register.
MOVE.ML {mem},{int} | [M]<-[#] | Moves a literal into a memory location.
MOVE.AL {areg},{int} | [A]<-[#] | Moves a literal into an address register.
MOVE.DL {dreg},{int} | [D]<-[#] | Moves a literal into a data register.
MOVE.DM {dreg},{mem} | [D]<-[M] | Moves the contents of a memory location into a data register.
ADD.DLL {dreg},{int},{int} | [D]<-[#]+[#] | Adds two literals and places the result in a data register.
ADD.DDL {dreg},{dreg},{int} | [D]<-[D]+[#] | Adds a literal to the value in a data register and places the result in a data register.
SUB.DLL {dreg},{int},{int} | [D]<-[#]-[#] | Subtracts the second literal from the first and places the result in a data register.
MULU.DLL {dreg},{int},{int} | [D]<-[#]*[#] | Multiplies two integers and places the result in a data register.
MULU.DDD {dreg},{dreg},{dreg} | [D]<-[D]*[D] | Multiplies the contents of two data registers and leaves the result in a data register.
DIV.DLL {dreg},{int},{int} | [D]<-[#]/[#] | Divides the first literal by the second and places the result in a data register.
JMP {linenum} | NONE | Branches to the specified line number in the source code.
BINZ {linenum} | NONE | Branches to the specified line number in the source code, if the Z flag is clear (OFF).
BIZ {linenum} | NONE | Branches to the specified line number in the source code, if the Z flag is set (ON).
CMP.LL {int},{int} | NONE | Compares two literals and sets the Z flag (ON) if they are equal.
SLT.DD {dreg},{dreg} | NONE | Sets the Z flag (ON) if the contents of the first data register are less than the contents of the second.
SLT.LL {int},{int} | NONE | Sets the Z flag (ON) if the first literal is less than the second.
|
SGT.DD {dreg},{dreg} |
NONE |
Sets the Z flag (ON), if the contents of the first data
register, are greater than the second. |
|
SGT.LL {int},{int} |
NONE |
Sets the Z flag (ON), if the first literal, is greater
than the second. |
---HTML Documentation – PAGE 6 ---
Defining Instructions
Before the UESA can be used seriously, we must define an instruction set. To open the "Instruction Definition" window, select "Define Instruction Set" from the "Configuration" menu, or press CTRL+I. The window displays a list of the currently defined instructions. We can enter a new instruction by typing the mnemonic in the command text field, followed by the RTL sequence in the RTL text field. To make the instruction part of the instruction set, press ADD INSTRUCTION.
When defining an instruction using RTL, the parameters have fixed data types. For example, if you specify an 'ADD' instruction which adds two literals, then the operands must always be literals; data registers cannot be substituted. If we needed to perform addition on two data registers, then a new instruction would have to be defined. In light of this, a proposed convention is to add an extension to the mnemonic which gives a clue as to what type of operands it expects. So in the case of the addition of two literals, with the result being placed in a data register, we would define the mnemonic as 'ADD.DLL'. The extension indicates that the first operand is a data register, the second a literal and the third also a literal.
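To illustrate the naming convention, the sketch below decodes the extension of a mnemonic into a list of expected operand types. It is an illustration only: the class `MnemonicExtension` is hypothetical, and the letters D, A, M and L are assumed to mean data register, address register, memory location and literal, as the built-in mnemonics (for example MOVE.DA, MOVE.ML and ADD.DLL) suggest. The convention is a naming aid; nothing here implies that the UESA itself inspects the extension.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: decodes the suggested ".DLL"-style extension of a mnemonic.
final class MnemonicExtension {
    static List<String> operandTypes(String mnemonic) {
        List<String> types = new ArrayList<>();
        int dot = mnemonic.indexOf('.');
        if (dot < 0) {
            return types; // no extension, e.g. JMP
        }
        // Each letter after the dot describes one operand, in order.
        for (char code : mnemonic.substring(dot + 1).toCharArray()) {
            switch (code) {
                case 'D': types.add("data register"); break;
                case 'A': types.add("address register"); break;
                case 'M': types.add("memory location"); break;
                case 'L': types.add("literal"); break;
                default:
                    throw new IllegalArgumentException("Unknown operand code: " + code);
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // Lists the three operand types implied by the extension DLL.
        System.out.println(operandTypes("ADD.DLL"));
    }
}
```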
RTL Form used by the UESA.
The UESA understands an RTL sequence made from the following constructs.
| Construct | Description |
| [A] | An address register |
| [D] | A data register |
| [M] | The contents of a memory location |
| [#] | An integer literal |
| Dest <- source | Move the source to the destination |
| - | Subtract |
| + | Add |
| / | Divide |
| * | Multiply |
Example Instruction 1
Define an instruction called MOVE.DL, which moves a literal into a data register.
Mnemonic: MOVE.DL
RTL Sequence: [D]<-[#]
Example Instruction 2
Define an instruction called ADD.DLL, which adds two literals and moves the result into a data register.
Mnemonic: ADD.DLL
RTL Sequence: [D]<-[#]+[#]
Example Instruction 3
Define an instruction called MOVE.DM, which moves the contents of a memory location into a data register.
Mnemonic: MOVE.DM
RTL Sequence: [D]<-[M]
Example Instruction 4
Define an instruction called MULSUB.ALLL, which subtracts one literal from another, performs multiplication on the result, and leaves the answer in an address register.
Mnemonic: MULSUB.ALLL
RTL Sequence: [A]<-[#]*[#]-[#]
Example Instruction 5
Define an instruction called MULU.MDD, which multiplies the contents of two data registers and leaves the result in a memory location.
Mnemonic: MULU.MDD
RTL Sequence: [M]<-[D]*[D]
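The structure of these RTL sequences can be made concrete with a short sketch that splits a sequence into its destination, its source operands and its operators. The class `RtlSketch` is hypothetical and is not the parser used by the UESA; it only recognises the constructs listed in the table above, and it makes no claim about the order in which the UESA applies the operators.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: RtlSketch is not a UESA class.
final class RtlSketch {
    final char destination;                               // 'A', 'D' or 'M'
    final List<Character> sources = new ArrayList<>();    // 'A', 'D', 'M' or '#'
    final List<Character> operators = new ArrayList<>();  // '+', '-', '*' or '/'

    RtlSketch(String rtl) {
        // Split "dest<-source" into its two halves, ignoring spaces.
        String[] halves = rtl.replace(" ", "").split("<-");
        if (halves.length != 2 || halves[0].length() != 3) {
            throw new IllegalArgumentException("Expected dest<-source: " + rtl);
        }
        destination = operandCode(halves[0], 0);
        if (destination == '#') {
            throw new IllegalArgumentException("A literal cannot be a destination: " + rtl);
        }
        // The source half alternates operand, operator, operand, ...
        String src = halves[1];
        int i = 0;
        while (true) {
            sources.add(operandCode(src, i));
            i += 3;
            if (i == src.length()) {
                break;
            }
            char op = src.charAt(i);
            if ("+-*/".indexOf(op) < 0) {
                throw new IllegalArgumentException("Unknown operator '" + op + "' in: " + rtl);
            }
            operators.add(op);
            i++;
        }
    }

    // Returns the letter inside a three-character construct such as "[D]".
    private static char operandCode(String s, int at) {
        boolean wellFormed = at + 3 <= s.length()
                && s.charAt(at) == '['
                && "ADM#".indexOf(s.charAt(at + 1)) >= 0
                && s.charAt(at + 2) == ']';
        if (!wellFormed) {
            throw new IllegalArgumentException("Bad operand in: " + s);
        }
        return s.charAt(at + 1);
    }

    public static void main(String[] args) {
        RtlSketch r = new RtlSketch("[M]<-[D]*[D]");
        System.out.println(r.destination + " " + r.sources + " " + r.operators);
    }
}
```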
---HTML Documentation – PAGE 7 ---
Source Code Layout
In order for the UESA to successfully execute your source code, the following rules must be adhered to.
---HTML Documentation – PAGE 8 ---
Executing Source Code
Before any source code can be executed, you must choose the file to process. To do this, select "Open Source" from the "File" menu, or press CTRL+O. Execution takes place from the "Run Source" window, which can be opened by selecting "Run Source..." from the "File" menu or by pressing CTRL+R.
The left-hand side of the window shows the list of currently defined data registers and their contents. Similarly, the right-hand side shows the list of defined address registers. The list in the centre of the window displays the source code being executed and any error messages that are generated. Underneath the list are the processor flags and their current state.
The buttons at the bottom of the window perform the following functions.
| Button | Function |
| Go! | Executes the entire source code. All registers and processor flags are updated automatically to reflect any changes. |
| Step Through | Executes one line of source code each time it is pressed. All registers and processor flags are updated automatically. |
| Show Memory | Displays a memory dump in the list, which shows all memory locations and their contents. |
| Reset | Clears all registers, processor flags and memory locations. It also resets the current position in the source code back to the start. |
| Close | Closes the "Run Source" window. |
Appendix H – User Interface Designs.

Figure H.1 – The interface design for the main window.

Figure H.2 – The interface design for the memory definition window.

Figure H.3 – The interface design for the register definition window.

Figure H.4 – The interface design for the instruction definition window.

Figure H.5 – The interface design for the run source window.
Appendix I – Assembly Test Programs.
Program 1.
;Program to test arithmetic operations
;Results are placed in consecutive data registers
;
;Add two literals
ADD.DLL D0, #50, #50
;Subtract one literal from another
SUB.DLL D1, #200, #100
;Multiply two literals
MULU.DLL D2, #10, #10
;Divide one literal by another
DIV.DLL D3, #1000, #10
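The listing in Appendix J suggests how an instruction such as `SUB.DLL D1, #200, #100` is evaluated: the operands are pushed onto a stack in source order, so the second literal is pulled off first and the result is computed as the first literal minus the second. The sketch below reproduces that flow under stated assumptions: it uses `java.util.ArrayDeque` in place of the project's own `MyStack` class, and the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the operand-stack flow for SUB.DLL; not UESA code.
final class OperandStackSketch {
    static int subDll(int[] dataRegisters, int destReg, int first, int second) {
        Deque<Integer> stack = new ArrayDeque<>();
        // Operands are pushed in the order they appear in the source line.
        stack.push(destReg);
        stack.push(first);
        stack.push(second);

        int val1 = stack.pop();   // second operand comes off first
        int val2 = stack.pop();   // then the first operand
        stack.push(val2 - val1);  // first minus second, as in the SUB case

        // The move step takes the result, then the destination register number.
        int result = stack.pop();
        int dest = stack.pop();
        dataRegisters[dest] = result;
        return result;
    }

    public static void main(String[] args) {
        int[] regs = new int[8];
        // Mirrors SUB.DLL D1, #200, #100 from Program 1.
        System.out.println(subDll(regs, 1, 200, 100));
    }
}
```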
Program 2.
;Program to test the movement of data into Registers and memory locations
;
;MOVE a literal to a Data Register
MOVE.DL D0, #5
;MOVE a literal to an Address Register
MOVE.AL A0, #10
;Move a literal to a memory location
MOVE.ML $5, #15
;Move contents of a data register into an address register
MOVE.AD A1, D0
;Move contents of a memory location, pointed at by an
;address register into a data register
MOVE.DA D1, A1
;Move contents of a memory location into a data register
MOVE.DM D2, $5
Program 3.
;Calculates the factorial of 5 and leaves result in D0
;
;Initialises running total register at 1
MOVE.DL D0, #1
;Initialises the loop counter register at 1
MOVE.DL D1, #1
;Moves the factorial to calculate
MOVE.DL D2, #5
;Multiply the loop counter by the running total and store in the running total
MULU.DDD D0, D0, D1
;Increment the loop counter
ADD.DDL D1, D1, #1
;Checks if loop counter is greater than the factorial
SGT.DD D1, D2
;Branch if the loop counter is not greater than the factorial
BINZ #9
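The control flow of Program 3 may be easier to follow as an equivalent Java loop. This is only an illustrative translation: the variable names stand in for the registers D0, D1 and D2, the boolean `z` stands in for the Z flag, and the class name is hypothetical. `BINZ #9` branches back to line 9 of the source file (the comment line that precedes MULU.DDD) while the Z flag is off.

```java
// Hypothetical translation of Program 3 into Java; not UESA code.
final class FactorialTrace {
    static int run(int n) {
        int d0 = 1;   // MOVE.DL D0, #1  (running total)
        int d1 = 1;   // MOVE.DL D1, #1  (loop counter)
        int d2 = n;   // MOVE.DL D2, #n  (the factorial to calculate)
        boolean z;
        do {
            d0 = d0 * d1;  // MULU.DDD D0, D0, D1
            d1 = d1 + 1;   // ADD.DDL D1, D1, #1
            z = d1 > d2;   // SGT.DD D1, D2 sets Z when D1 is greater than D2
        } while (!z);      // BINZ #9 branches while Z is off
        return d0;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // prints 120
    }
}
```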
Program 4.
;Program to test the processor flags
;Z (Zero) and N (negative)
;
;Sets Z flag (because result is zero)
ADD.DLL D0, #-5, #5
;Clears Z flag (because result is not zero)
ADD.DLL D1, #5, #5
;Sets N flag (because result is negative)
SUB.DLL D2, #10, #20
;Clears N flag (because result is positive)
SUB.DLL D3, #10, #5
;Sets the Z flag (because operands are equal)
CMP.LL #10, #10
;Clears the Z flag (because operands are not equal)
CMP.LL #5, #10
;Sets the Z flag (because operand 1 is greater)
SGT.LL #15, #10
;Clears the Z flag (because operand 1 is less)
SGT.LL #10, #15
;Sets the Z flag (because operand 1 is less)
SLT.LL #10, #15
;Clears the Z flag (because operand 1 is greater)
SLT.LL #15, #10
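The flag behaviour that Program 4 exercises can be summarised in a few lines of Java. The sketch follows the rules visible in the Appendix J listing (a move sets Z when the value is zero and N when it is negative; the compare and set instructions set or clear Z and clear N), but the class `FlagSketch` and its method names are hypothetical and stand in for the project's `CCR` class, whose source is not shown here.

```java
// Hypothetical model of the Z and N flags; the real CCR class is not shown in the report.
final class FlagSketch {
    boolean z;  // Zero flag
    boolean n;  // Negative flag

    // A value moved into a register or memory location updates both flags.
    void afterMove(int value) {
        z = (value == 0);
        n = (value < 0);
    }

    // CMP sets Z when the operands are equal and clears N.
    void afterCompare(int first, int second) {
        z = (first == second);
        n = false;
    }

    // SGT sets Z when the first operand is greater than the second.
    void afterSetGreater(int first, int second) {
        z = (first > second);
        n = false;
    }

    // SLT sets Z when the first operand is less than the second.
    void afterSetLess(int first, int second) {
        z = (first < second);
        n = false;
    }

    public static void main(String[] args) {
        FlagSketch flags = new FlagSketch();
        flags.afterMove(10 - 20);  // SUB.DLL D2, #10, #20
        System.out.println("Z=" + flags.z + " N=" + flags.n);
    }
}
```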
Appendix J – Sample Source Code
The Java class which handles parsing and execution.
import java.awt.*;
import java.io.*;
import java.util.*;
import java.awt.event.*;
import javax.swing.*;
import javax.swing.event.*;

/* Actually executes the source code and outputs any messages to the
** listview on the Run Source Window
*/
class ParseSource
{
    public ParseSource(String fname, DefaultListModel model, Register[] al, Register[] dl, CCR concoreg)
    {
        int m = 0;
        filename = fname;
        System.arraycopy(dl, 0, dreglist, 0, 32);
        System.arraycopy(al, 0, areglist, 0, 32);
        ccr = concoreg;
        for (m = 0; m < 4096; m++)
        {
            memlist[m] = new MemoryLocation();
        }
        output = model;
        stk = new MyStack();
        stk.Stack(20);
        say("***Executing " + filename + "...");
        try
        {
            fin = new BufferedReader(new FileReader(filename));
            //Mark the start of the file so that branches can reset to it
            fin.mark(1000);
        }
        catch (Exception e)
        {
            System.out.println("Warning: Exception Generated.");
        }
    }
    public void go()
    {
        try
        {
            while (true)
            {
                inststate = 0;
                line++;
                //Get a new line of text from source
                str = returnline(fin);
                //Get length of string. At end of file str is null, so this
                //throws and the catch block below ends the run.
                len = str.length();
                //Reset x to zero for new line
                x = 0;
                //Return character in string str at position x
                ch = returnchar(str, x);
                while (x < (len))
                {
                    switch (ch)
                    {
                        case ';':
                            parseComment();
                            break;
                        case ' ':
                            parseSpace();
                            break;
                        case ':':
                            labelError();
                            break;
                        default:
                            c = new Character(ch);
                            tempstr = tempstr + c;
                            parseDefault();
                            break;
                    }
                }
            }
        }
        catch (Exception e)
        {
            say("***End ");
        }
        closeFile();
    }
    public void step()
    {
        try
        {
            inststate = 0;
            line++;
            //Get a new line of text from source
            str = returnline(fin);
            //Get length of string
            len = str.length();
            //Reset x to zero for new line
            x = 0;
            //Return character in string str at position x
            ch = returnchar(str, x);
            while (x < (len))
            {
                switch (ch)
                {
                    case ';':
                        parseComment();
                        break;
                    case ' ':
                        parseSpace();
                        break;
                    case ':':
                        labelError();
                        break;
                    default:
                        c = new Character(ch);
                        tempstr = tempstr + c;
                        parseDefault();
                        break;
                }
            }
        }
        catch (Exception e)
        {
            say("***End ");
        }
    }
    public void closeFile()
    {
        try
        { fin.close(); }
        catch (IOException e)
        { System.out.println("Unable to close file"); }
    }
    /* Collects the digits together for an operand, converts
    ** them to an integer and places it on the stack
    */
    private void parseOperand(char optype)
    {
        String convtonum = "";
        while (ch != ',' && x < (len - 1))
        {
            x++;
            ch = returnchar(str, x);
            if (ch != ',')
            {
                c = new Character(ch);
                convtonum = convtonum + c;
            }
        }
        //Range-check register numbers and memory addresses against the
        //configured architecture
        switch (optype)
        {
            case 'A':
                if (StringToInt.strToInt(convtonum) > StringToInt.strToInt(Prefs.getAreg()) - 1)
                { say("***Error: Address Register out of range"); }
                break;
            case 'D':
                if (StringToInt.strToInt(convtonum) > StringToInt.strToInt(Prefs.getDreg()) - 1)
                { say("***Error: Data Register out of range"); }
                break;
            case '$':
                if (StringToInt.strToInt(convtonum) > StringToInt.strToInt(Prefs.getMemSize()) - 1)
                { say("***Error: Memory Location out of range"); }
                break;
        }
        stk.push(StringToInt.strToInt(convtonum));
        convtonum = "";
    }
    /* Parses the operand set and breaks it up into individual
    ** operands. It knows about the number of operands an
    ** instruction should have, and detects the number of
    ** operands actually used in the source by detecting
    ** the commas.
    */
    private void parseOperandSet(Instruction cmd)
    {
        int operandnum = cmd.getNumOps();
        int numparsed = 0;
        while (x < (len - 1) && numparsed < operandnum)
        {
            x++;
            ch = returnchar(str, x);
            switch (ch)
            {
                case 'A':
                    parseOperand('A');
                    numparsed++;
                    break;
                case 'D':
                    parseOperand('D');
                    numparsed++;
                    break;
                case '#':
                    parseOperand('#');
                    numparsed++;
                    break;
                case '$':
                    parseOperand('$');
                    numparsed++;
                    break;
                case ' ':
                    break;
                default:
                    say("***Errors with Instruction Operands");
                    break;
            }
        }
        execute(cmd);
    }
    /* Displays the comment to the screen and jumps to the
    ** end of line, in preparation for reading the next
    ** line
    */
    private void parseComment()
    {
        i = new Integer(line);
        say(str);
        x = len + 1;
    }

    /* Simply ignores whitespace characters
    */
    private void parseSpace()
    {
        x++;
        ch = returnchar(str, x);
    }

    /* Outputs an error to say that a label has been
    ** detected without a name
    */
    private void labelError()
    {
        i = new Integer(line);
        say(str);
        say("***Error: Label with no name at line " + i);
        x = len + 1;
    }

    /* Simple method to output a string to the list on
    ** the run source window.
    */
    private void say(String s) { output.addElement(s); }
    /* Sends the string to checkCommand to see if it is
    ** an instruction that exists. If not it outputs an
    ** error, otherwise it hands the instruction over
    ** for operand parsing.
    */
    private void parseCommand()
    {
        Instruction cmd;
        i = new Integer(line);
        say(str);
        cmd = checkCommand(tempstr);
        tempstr = "";
        if (cmd != null)
        {
            parseOperandSet(cmd);
        }
        else
        {
            say("***Unknown Instruction");
        }
    }

    /* Displays the label to the screen and adds it to the
    ** label array for future use. Currently labels are not
    ** used for branching
    */
    private void parseLabel()
    {
        i = new Integer(line);
        say(tempstr);
        labellist[numlabel] = new Label(tempstr, line);
        numlabel++;
        say("***Label Detected at line " + i);
        tempstr = "";
    }

    /* Reads characters from the line until a terminating
    ** space or colon is found. It then determines whether
    ** the string is a mnemonic or a label
    */
    private void parseDefault()
    {
        while (inststate == 0)
        {
            x++;
            ch = returnchar(str, x);
            if (ch != ' ')
            {
                c = new Character(ch);
                tempstr = tempstr + c;
            }
            inststate = insthandle(ch);
        }
        if (inststate == 1)
        { parseCommand(); }
        else
        { parseLabel(); }
        x = len + 1;
    }
    /*
    ** Reads in a line of text from the specified
    ** BufferedReader and returns it.
    */
    private static String returnline(BufferedReader fin)
    {
        String str = "";
        try
        { str = fin.readLine(); }
        catch (Exception e)
        { System.out.println("***End"); }
        return str;
    }

    /*
    ** Returns the character at position x in a string.
    ** Checks if the position is out of bounds and
    ** prints an error message if that is the case.
    ** Note: x equal to the string length is not caught
    ** by this check, so charAt will throw in that case.
    */
    public static char returnchar(String mystr, int x)
    {
        char ch = '!';
        if (x > mystr.length())
        { System.out.println("String Out of Range"); }
        else
        { ch = mystr.charAt(x); }
        return ch;
    }

    /*
    ** Checks whether the next character is a space,
    ** colon or other character. Labels are
    ** terminated with ':', commands are terminated
    ** with a space
    */
    private static int insthandle(char ch)
    {
        int val = 0;
        switch (ch)
        {
            case ' ': //Must Be a Command
                val = 1;
                break;
            case ':': //Must Be a Label
                val = 2;
                break;
            default: //Read Another Character
                val = 0;
                break;
        }
        return val;
    }
    /* Checks to see whether the command does exist
    ** and has been defined. It does this by comparing
    ** the mnemonic from the source to the one in the
    ** instruction array. If it does exist, it returns
    ** the command
    */
    private Instruction checkCommand(String s)
    {
        int instnum, x = 1;
        String teststr;
        boolean exists = false;
        Instruction cmd = null;
        instnum = Prefs.getNumInst();
        while (exists == false && x < instnum + 1)
        {
            cmd = Prefs.getInst(x);
            teststr = cmd.getName();
            if (s.equals(teststr))
            { exists = true; }
            x++;
        }
        if (exists == false)
        { cmd = null; }
        return cmd;
    }
    /* Executes an instruction by pulling the operands off the stack,
    ** and obtaining their datatypes by looking at the instruction's
    ** operand array. The functions performed on those operands are
    ** found by looking in the instruction's function array.
    */
    private void execute(Instruction cmd)
    {
        //Pointer to element in operand array
        opptr = 0;
        int numberofoperands = cmd.getNumOps();
        int numberoffuncs = cmd.getNumFunc();
        //funcptr is a pointer to element in function array
        int functype = 0, funcptr = 0, val1, val2;
        int[] flst = cmd.getFuncList();
        int[] olst = cmd.getOpList();
        //Set to true if the result of a function has been placed
        //on the stack to be used in the next.
        boolean resonstack = false;
        onstacktype = 0;
        //Cycles through the function array executing each function
        //in turn
        for (funcptr = 0; funcptr < numberoffuncs; funcptr++)
        {
            switch (functype = flst[funcptr])
            {
                //Perform an add function with two operands, in the
                //case where the result of a previous function is
                //on the stack, use that. Put result on stack
                case ADD:
                    if (resonstack == true)
                    { val1 = resolveValue(onstacktype, false); }
                    else
                    { val1 = resolveValue(olst[opptr], true); }
                    val2 = resolveValue(olst[opptr], true);
                    stk.push(val1 + val2);
                    System.out.println(val1 + val2);
                    resonstack = true;
                    onstacktype = LITERAL;
                    break;
                //Perform a sub function with two operands, in the
                //case where the result of a previous function is
                //on the stack, use that. Put result on stack.
                //val1 is the operand pulled first (the later one in the
                //source), so the result is val2 - val1.
                case SUB:
                    if (resonstack == true)
                    { val1 = resolveValue(onstacktype, false); }
                    else
                    { val1 = resolveValue(olst[opptr], true); }
                    val2 = resolveValue(olst[opptr], true);
                    stk.push(val2 - val1);
                    resonstack = true;
                    onstacktype = LITERAL;
                    break;
                //Perform a mul function with two operands, in the
                //case where the result of a previous function is
                //on the stack, use that. Put result on stack
                case MUL:
                    if (resonstack == true)
                    { val1 = resolveValue(onstacktype, false); }
                    else
                    { val1 = resolveValue(olst[opptr], true); }
                    val2 = resolveValue(olst[opptr], true);
                    stk.push(val1 * val2);
                    resonstack = true;
                    onstacktype = LITERAL;
                    break;
                //Perform a DIV function with two operands, in the
                //case where the result of a previous function is
                //on the stack, use that. Put result on stack.
                //As with SUB, the result is val2 / val1.
                case DIV:
                    if (resonstack == true)
                    { val1 = resolveValue(onstacktype, false); }
                    else
                    { val1 = resolveValue(olst[opptr], true); }
                    val2 = resolveValue(olst[opptr], true);
                    stk.push(val2 / val1);
                    resonstack = true;
                    onstacktype = LITERAL;
                    break;
                //Perform a move function to a memory location, data or
                //address register. Update the condition code register
                //to reflect the result.
                case MOV:
                    if (resonstack == true)
                    { val1 = resolveValue(onstacktype, false); resonstack = false; }
                    else
                    { val1 = resolveValue(olst[opptr], true); }
                    val2 = stk.pull();
                    switch (olst[opptr])
                    {
                        case DATAREG:
                            dreglist[val2].setValue(val1);
                            onstacktype = DATAREG;
                            opptr++;
                            break;
                        case ADDRREG:
                            areglist[val2].setValue(val1);
                            onstacktype = ADDRREG;
                            opptr++;
                            break;
                        case MEMLOCA:
                            memlist[val2].setValue(val1);
                            onstacktype = MEMLOCA;
                            opptr++;
                            break;
                    }
                    if (val1 == 0)
                    { ccr.setZOn(); }
                    else
                    { ccr.setZOff(); }
                    if (val1 < 0)
                    { ccr.setNOn(); }
                    else
                    { ccr.setNOff(); }
                    resonstack = true;
                    stk.push(val2);
                    break;
                //Jump to a specific line in the source code by resetting
                //the source back to the start and reading lines, discarding
                //them, until the jump line is reached.
                case JMP:
                    try
                    {
                        int t, jumpto;
                        jumpto = stk.pull();
                        fin.reset();
                        for (t = 0; t < jumpto - 1; t++)
                        {
                            fin.readLine();
                        }
                        line = jumpto - 1;
                        ccr.setNOff();
                        ccr.setZOff();
                    }
                    catch (IOException e)
                    {
                        System.out.println("Error with Branch");
                    }
                    break;
                //Set the Z flag if operand 1 is greater than operand 2.
                //val1 is pulled first and so holds operand 2; val2 holds
                //operand 1.
                case SGT:
                    val1 = resolveValue(olst[opptr], true);
                    val2 = resolveValue(olst[opptr], true);
                    if (val2 > val1)
                    { ccr.setZOn(); }
                    else
                    { ccr.setZOff(); }
                    ccr.setNOff();
                    break;
                //Set the Z flag if operand 1 is less than operand 2
                case SLT:
                    val1 = resolveValue(olst[opptr], true);
                    val2 = resolveValue(olst[opptr], true);
                    if (val2 < val1)
                    { ccr.setZOn(); }
                    else
                    { ccr.setZOff(); }
                    ccr.setNOff();
                    break;
                //Check the Z flag and branch if it is zero. BINZ
                //means Branch If Not Zero, i.e. branch if the last
                //instruction did not produce a zero
                case BINZ:
                    if (ccr.getZ() == false)
                    {
                        try
                        {
                            int t, jumpto;
                            jumpto = stk.pull();
                            fin.reset();
                            for (t = 0; t < jumpto - 1; t++)
                            {
                                fin.readLine();
                            }
                            line = jumpto - 1;
                        }
                        catch (IOException e)
                        {
                            System.out.println("Error with Branch");
                        }
                    }
                    ccr.setNOff();
                    ccr.setZOff();
                    break;
                //Check the Z flag and branch if it is 1. BIZ
                //means Branch If Zero, i.e. branch if the last
                //instruction produced a zero
                case BIZ:
                    if (ccr.getZ() == true)
                    {
                        try
                        {
                            int t, jumpto;
                            jumpto = stk.pull();
                            fin.reset();
                            for (t = 0; t < jumpto - 1; t++)
                            {
                                fin.readLine();
                            }
                            line = jumpto - 1;
                        }
                        catch (IOException e)
                        {
                            System.out.println("Error with Branch");
                        }
                    }
                    ccr.setNOff();
                    ccr.setZOff();
                    break;
                //Compare function, compares two values and sets the Z
                //flag if they are equal.
                case CMP:
                    val1 = resolveValue(olst[opptr], true);
                    val2 = resolveValue(olst[opptr], true);
                    if (val1 == val2)
                    { ccr.setZOn(); }
                    else
                    { ccr.setZOff(); }
                    ccr.setNOff();
                    break;
            }
        }
    }
    /* Resolves a value obtained from the stack and returns the real value.
    ** For example, the number 1 may be on the stack, this could mean a
    ** data register or the literal 1. In the case of a data register it
    ** would obtain the value in D1 and return that as the real value.
    */
    private int resolveValue(int v, boolean doinc)
    {
        int value = 0;
        switch (v)
        {
            //Simply return the raw value
            case LITERAL:
                value = stk.pull();
                onstacktype = LITERAL;
                break;
            //Return the contents of the data register
            case DATAREG:
                value = dreglist[stk.pull()].getValue();
                onstacktype = DATAREG;
                break;
            //Return the contents of a mem location pointed at by the
            //address register
            case ADDRREG:
                value = areglist[stk.pull()].getValue();
                value = memlist[value].getValue();
                onstacktype = ADDRREG;
                break;
            //Return the contents of a memory location.
            case MEMLOCA:
                value = memlist[stk.pull()].getValue();
                onstacktype = MEMLOCA;
                break;
            //Currently - do nothing.
            case LABEL:
                onstacktype = LABEL;
                break;
        }
        //Increment the operand array pointer to the next element.
        if (doinc == true)
        { opptr++; }
        return value;
    }
    /* Displays a memory dump on the "Run Source" window. */
    public void memoryDump()
    {
        int numloc = StringToInt.strToInt(Prefs.getMemSize()) - 1;
        int x;
        Integer i, n;
        say(" ");
        say("***Performing Memory Dump...");
        //Cycles through the memory location array and outputs the contents
        //of each memory location.
        for (x = 0; x < numloc + 1; x++)
        {
            i = new Integer(x);
            n = new Integer(memlist[x].getValue());
            say("Address: " + i + " Contents: " + n);
        }
    }
    private int onstacktype = 0;
    private Integer i;
    private int line = 0;
    private String str;
    private int len;
    private int x;
    private DefaultListModel output;
    private char ch;
    private Character c;
    private String tempstr = "";
    private int inststate = 0;
    private MyStack stk;
    private Register[] dreglist = new Register[32];
    private Register[] areglist = new Register[32];
    private MemoryLocation[] memlist = new MemoryLocation[4096];
    private Label[] labellist = new Label[30];
    private int numlabel = 0;
    private int opptr = 0;

    //Function codes held in an instruction's function array
    private final int ADD = 0;
    private final int SUB = 1;
    private final int MUL = 2;
    private final int DIV = 3;
    private final int MOV = 4;
    private final int JMP = 5;
    private final int SGT = 6;
    private final int SLT = 7;
    private final int BINZ = 8;
    private final int BIZ = 9;
    private final int CMP = 10;

    //Operand type codes held in an instruction's operand array
    private final int LITERAL = 0;
    private final int DATAREG = 1;
    private final int ADDRREG = 2;
    private final int MEMLOCA = 3;
    private final int LABEL = 4;

    private BufferedReader fin;
    private CCR ccr;
    private String filename;
}
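In the listing above, the JMP, BINZ and BIZ cases implement a branch by resetting the reader to the mark set in the constructor and re-reading lines until the target is reached. The constructor calls `mark(1000)`, and the `BufferedReader` documentation states that a reset may fail once more characters than the read-ahead limit have been read, so this approach may not suit longer source files. One possible alternative, shown here purely as a sketch and not as the method used by the UESA, is to hold the source lines in a list and move a line counter. The class `LineCounterSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical alternative to mark/reset branching; not part of the UESA.
final class LineCounterSketch {
    private final List<String> lines;
    private int next = 0;  // zero-based index of the next line to execute

    LineCounterSketch(List<String> sourceLines) {
        this.lines = new ArrayList<>(sourceLines);
    }

    boolean hasNext() {
        return next < lines.size();
    }

    // Returns the next line and advances the counter.
    String nextLine() {
        return lines.get(next++);
    }

    // Branches to a one-based line number, as used by JMP, BINZ and BIZ.
    void jumpTo(int lineNumber) {
        if (lineNumber < 1 || lineNumber > lines.size()) {
            throw new IllegalArgumentException("No such line: " + lineNumber);
        }
        next = lineNumber - 1;
    }

    public static void main(String[] args) {
        LineCounterSketch src = new LineCounterSketch(
                Arrays.asList(";first line", "MOVE.DL D0, #1", "JMP #1"));
        src.nextLine();
        src.nextLine();
        src.jumpTo(1);  // what executing "JMP #1" would do
        System.out.println(src.nextLine());
    }
}
```

Because the whole file is held in memory, a branch becomes a single assignment and no longer depends on the reader's read-ahead limit.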