Difference between revisions of "Extending Disassembly with Ozone"

From SEGGER Wiki

Revision as of 20:20, 23 October 2019


Introduction

With the advent of custom instructions in Armv8-M in mid 2019, all of Ozone's supported target architectures now allow MCU core vendors to add custom instructions to their design.

Considering this technological development, it became highly desirable to supply customers with a tool for extending Ozone's instruction set knowledge on a particular architecture as well.

Since version 2.71a, Ozone supports custom instructions via disassembly support plugins.

Disassembly Plugins

A disassembly plugin extends Ozone's disassembler by:

  1. providing the assembly code of custom instructions.
  2. providing numerical information about custom instructions, such as the PC branched to. Ozone broadly relies on numerical instruction information in multiple areas, such as its call graph window.

Disassembly plugins are written in JavaScript. All of JavaScript's basic language constructs are supported. Ozone poses a single requirement on disassembly plugins which is that all script code must be contained within functions.

Loading The Plugin

Command Project.SetDisassemblyPlugin is provided to load a disassembly plugin. When this command is placed into project file function OnProjectLoad, the plugin will be loaded each time the project is opened. The command has a single argument, which is the file path.

Users may alternatively execute action Set Script of the disassembly window context menu in order to load a disassembly plugin. When executed, this action will also edit the project file accordingly.

Script Functions Overview

A disassembly plugin consists of 3 predefined functions:

Function | Description | Executed When | Optional
init | Performs initialization tasks | plugin load | Yes
printInstAsm | Returns the disassembly text of a custom (or overridden) instruction | on-demand | Yes
getInstInfo | Returns numeric information about a custom (or overridden) instruction, such as the PC branched to | program file load | Yes

Next to the predefined script functions, users are free to add their own functions to disassembly support scripts in order to structure the code.
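The overall shape of a plugin file can be sketched as follows. This is only a skeleton under stated assumptions: the helper getOpcode is a hypothetical user-defined function (not part of the Ozone API), and the opcode check relies on the RISC-V convention that the major opcode occupies bits 6:0 of a 32-bit instruction. All code sits inside functions, as required.

```javascript
// Skeleton of a disassembly plugin (sketch, not a complete implementation).

// Hypothetical user-defined helper: extracts the RISC-V major opcode
// (bits 6:0) from the first instruction byte.
function getOpcode(InstData) {
  return InstData[0] & 0x7F;
}

function init() {
  // Calls to Debug.enableOverrideInst would be placed here when known
  // instructions are to be overridden.
  return 0; // 0: OK, -1: error
}

function printInstAsm(InstAddr, InstLen, InstData, Flags) {
  if (InstLen == 4 && getOpcode(InstData) == 0x63) {
    // Decode the instruction here and return its assembly text
    // when it is a custom instruction handled by this plugin.
  }
  return undefined; // instruction not handled by this plugin
}

function getInstInfo(InstAddr, InstLen, InstData, Flags) {
  return undefined; // no numerical information provided
}
```

Returning undefined from printInstAsm and getInstInfo signals that the plugin does not handle the instruction.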

Example Implementation

This section provides an example implementation which adds support for a custom instruction on a RI5CY RISC-V MCU core.

init

A disassembly plugin implementation typically starts with script function init. This function is called when the disassembly plugin is loaded. The main purpose of function init is to provide a place where instruction overrides using command Debug.enableOverrideInst can be defined. An instruction override allows users to alter the disassembly and numerical information of a known instruction.

/*********************************************************************
*
*       init
*
*  Function Description
*    Called by Ozone when the script was loaded 
*    (i.e. when command "Project.SetDisassemblyPlugin" was executed).
*
*    Typical usage: executes one or multiple "Debug.enableOverrideInst"
*    commands which define the instructions whose default disassembly 
*    is to be overridden by this plugin.
*
*  Return Value
*     0:  OK
*    -1:  error
*/
function init() {

  var InstLen;
  var InstData = new Array();
  var InstMask = new Array();
  //
  // Mark the instruction "ADDI sp, sp, -16" (0x1141) as overridden by this plugin
  //
  InstLen     = 0x2;
  InstData[0] = 0x41;
  InstData[1] = 0x11;
  InstMask[0] = 0xFF; // all encoding bits are relevant
  InstMask[1] = 0xFF; // all encoding bits are relevant

  Debug.enableOverrideInst(InstLen, InstData, InstMask);

  return 0;
}

This example implementation of init overrides the instruction with integer encoding 0x1141.

printInstAsm

Next, we implement function printInstAsm in order to:

  • provide the disassembly of custom instruction "P.BEQIMM" with integer encoding 0x06362A63
  • provide the disassembly of overridden instruction 0x1141
/*********************************************************************
*
*       printInstAsm
*
*  Function Description
*    Prints the assembly code of an instruction.
*
*  Function Parameters
*    InstAddr     instruction address (type: U64)
*    InstLen      instruction byte length (type: U32)
*    InstData     instruction bytes (type: byte array)
*    Flags        basic information about the instruction required for analysis. 
*                 Interpretation depends on architecture. 
*
*  Return Value
*    undefined:   if the input instruction is not supported by this plugin
*    string:      assembly code string containing a single tab after the instruction
*                 mnemonic and a single tab before a possible trailing comment
*/ 
function printInstAsm(InstAddr, InstLen, InstData, Flags) {

  if (InstLen == 4) {

    var Encoding = (InstData[3] << 24) | (InstData[2] << 16) | 
                   (InstData[1] << 8)  | InstData[0];

    if ((Encoding & 0x707F) == 0x2063) { // opcode == "P.BEQIMM" ?
      // 
      // "P.BEQIMM" is a PC-relative conditional branch
      //
      // Operation:
      //     If (Rs1 == Imm5) branch to InstAddr + (Imm12 << 1).
      //
      // Encoding = {Imm12 | Imm5 | rs1 | funct3 | Imm12 | opcode}
      //            ----------------------------------------------
      //            [31:25]               [14:12]          [6:0]     
      //            ----------------------------------------------     
      //              -        -     -      010      -    1100011
      //
      var a       =  (Encoding & 0x80)       >> 7;   // Encoding[7]     -> offset bit 11
      var b       =  (Encoding & 0xF00)      >> 8;   // Encoding[11:8]  -> offset bits 4:1
      var c       =  (Encoding & 0x7E000000) >> 25;  // Encoding[30:25] -> offset bits 10:5
      var d       =  (Encoding & 0x80000000) >>> 31; // Encoding[31]    -> offset bit 12 (sign)
      var Imm5    =  (Encoding & 0x1F00000)  >> 20;
      var Rs1     =  (Encoding & 0xF8000)    >> 15;

      var Imm12   =  (b | (c << 4) | (a << 10) | (d << 11)) << 1;

      if (d) {
        Imm12 -= 0x2000; // sign-extend the 13-bit byte offset (backward branch)
      }

      var sSymbol =  Debug.getSymbol(InstAddr + Imm12);

      var sInst   = "P.BEQIMM\t" + getRegName(Rs1) + ", " + Imm5 + ", " + Imm12;

      if (!sSymbol) { // no symbol found (undefined or empty string)
        return sInst;
      } else {
        return sInst + "\t; " + sSymbol;
      }
    }

  } else if (InstLen == 2) {
    var Encoding = (InstData[1] << 8) | InstData[0];
    if (Encoding == 0x1141) { // "ADDI sp, sp, -16" ?
      return "ADDI\tsp, sp, -0x10";
    }
  }
  return undefined;
}

A typical implementation of printInstAsm will be largely based on integer arithmetic, as this example illustrates. The example executes a single debugger API function call with Debug.getSymbol. This API function returns the name of the symbol at or preceding the input address. The symbol name is appended as a comment to the returned assembly code text. Function getRegName is a user-defined script function which returns the name of a RISC-V register.
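The source does not list getRegName. A minimal sketch of such a helper, assuming it should return the standard RISC-V ABI names of the 32 integer registers, could look like this. The fallback for out-of-range indices is an illustrative choice, not something the source prescribes.

```javascript
// Sketch of the user-defined helper getRegName (not part of the Ozone API).
// Maps a RISC-V integer register index (0..31) to its ABI name.
function getRegName(Index) {
  var aNames = [
    "zero", "ra", "sp",  "gp",  "tp", "t0", "t1", "t2",
    "s0",   "s1", "a0",  "a1",  "a2", "a3", "a4", "a5",
    "a6",   "a7", "s2",  "s3",  "s4", "s5", "s6", "s7",
    "s8",   "s9", "s10", "s11", "t3", "t4", "t5", "t6"
  ];
  if (Index >= 0 && Index < aNames.length) {
    return aNames[Index];
  }
  return "x" + Index; // assumed fallback for unexpected input
}
```

For the example instruction, Rs1 decodes to 12, which this helper maps to a2.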

The text returned by function printInstAsm must have the format <mnemonic>\t<operands>, optionally followed by \t; <comment>. For example: P.BEQIMM\ta2, 3, 116\t; OS_Idle
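The required text layout can be produced by a small helper. The following is a sketch only: formatAsm is a hypothetical user-defined function, and treating an undefined or empty comment as "no comment" is an assumption made for this illustration.

```javascript
// Hypothetical helper: builds "<mnemonic>\t<operands>" and appends
// "\t; <comment>" only when a non-empty comment is given.
function formatAsm(sMnemonic, sOperands, sComment) {
  var sText = sMnemonic + "\t" + sOperands;
  if (sComment) { // skips undefined, null and ""
    sText += "\t; " + sComment;
  }
  return sText;
}
```

Called as formatAsm("P.BEQIMM", "a2, 3, 116", "OS_Idle"), the helper yields the example string shown above.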

getInstInfo

We also want the disassembly plugin to provide numerical information about custom instruction "P.BEQIMM" to Ozone, such as the branch destination PC. This will allow Ozone to assemble and display correct information in areas that are based on numerical instruction information, such as the call-graph window.

The plugin delivers numerical instruction information to Ozone via script function getInstInfo.

/*********************************************************************
*
*       getInstInfo
*
*  Function Description
*    Returns numerical information about an instruction.
*
*    Used by Ozone to generate timeline stacks and call-graphs,
*    among other applications.
*
*  Function Parameters
*    InstAddr     instruction address (type: U64)
*    InstLen      instruction byte length (type: U32)
*    InstData     instruction data bytes (type: byte array)
*    Flags        basic information about the instruction required for analysis. 
*                 Interpretation depends on architecture. 
*                 See Ozone user guide for more information (type: U32)
*
*  Return Value
*    undefined:   if the input instruction is not supported by this plugin
*    InstInfo:    a javascript object corresponding to the following C structure:
*          
*    struct INST_INFO {
*      U32 Mode;         // instruction execution mode (for ex. THUMB or ARM)
*      U32 Size;         // instruction byte size
*      U64 AccessAddr;   // branch address or memory access address
*      int StackAdjust;  // Difference between SP before and after instruction execution
*      U32 Flags;        // binary instruction information
*    }
*
*  Notes
*    (1) Example input
*
*        InstAddr     0x20000192
*        InstLen      4
*        InstData     63 2A 36 06  ("P.BEQIMM a2, 3, 116")
*        Flags        0 
*/
function getInstInfo(InstAddr, InstLen, InstData, Flags) {

  if (InstLen == 4) {

    var Encoding = (InstData[3] << 24) | (InstData[2] << 16) | (InstData[1] << 8) | InstData[0];

    if ((Encoding & 0x707F) == 0x2063) { // opcode == "P.BEQIMM" ?
      // 
      // "P.BEQIMM" is a PC-relative conditional branch
      //
      // Operation:
      //     If (Rs1 == Imm5) branch to InstAddr + (Imm12 << 1).
      //
      // Encoding = {Imm12 | Imm5 | rs1 | funct3 | Imm12 | opcode}
      //            ----------------------------------------------
      //            [31:25]               [14:12]          [6:0]     
      //            ----------------------------------------------     
      //              -        -     -      010      -    1100011
      //
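      var a = (Encoding & 0x80)       >> 7;   // Encoding[7]     -> offset bit 11
      var b = (Encoding & 0xF00)      >> 8;   // Encoding[11:8]  -> offset bits 4:1
      var c = (Encoding & 0x7E000000) >> 25;  // Encoding[30:25] -> offset bits 10:5
      var d = (Encoding & 0x80000000) >>> 31; // Encoding[31]    -> offset bit 12 (sign)

      var Imm12 = (b | (c << 4) | (a << 10) | (d << 11)) << 1;

      if (d) {
        Imm12 -= 0x2000; // sign-extend the 13-bit byte offset (backward branch)
      }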

      var InstInfo;

      InstInfo             = new Object();
      InstInfo.Size        = 4;
      InstInfo.Mode        = 0;
      InstInfo.StackAdjust = 0;
      InstInfo.AccessAddr  = InstAddr + Imm12;
      InstInfo.Flags       = 0x444; // IsBranch (bit 2) | IsFixedAddress (bit 6) | IsConditional (bit 10), assuming fields are allocated from bit 0 in the listed order

      TargetInterface.message("Instruction info provided by disassembly support plugin");

      return InstInfo;

    } // if opcode == "P.BEQIMM"

  } // if InstLen == 4

  return undefined;
}

As demonstrated in the example above, numerical instruction information is returned as a JavaScript object containing a predefined set of members. The 32 bit unsigned Flags member of this object has the following bit field layout:

struct {                       
  U32 IsCtrlTransfer : 1; // Instruction possibly alters the PC
  U32 IsSoftIRQ      : 1; // Instruction is a software interrupt request
  U32 IsBranch       : 1; // Instruction is a simple branch (B, JMP, ...)
  U32 IsCall         : 1; // Instruction is a function call (Branch with Link, BL, CALL, ...)
  U32 IsReturn       : 1; // Dedicated return instruction or return-style branch (e.g. POP PC)
  U32 IsMemAccess    : 1; // Instruction reads from or writes to memory
  U32 IsFixedAddress : 1; // Branch or access address is fixed (absolute or PC-relative)
  U32 IsBP           : 1; // Instruction is a SW breakpoint
  U32 IsSemiHosting  : 1; // Instruction could be a semihosting instruction
  U32 IsNOP          : 1; // Instruction is a NOP
  U32 IsConditional  : 1; // Instruction is conditionally executed
  U32 Condition      : 4; // Condition if conditionally executed
} Flags;
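In a script, the Flags value has to be composed as an integer. The sketch below assumes that the bit fields are allocated starting at bit 0 in the order listed above; this ordering is an assumption, since C bit-field allocation is implementation-defined and the source does not state it. All constant and function names are hypothetical.

```javascript
// Hypothetical bit masks, assuming the first listed field occupies bit 0.
var FLAG_IS_CTRL_TRANSFER = 1 << 0;
var FLAG_IS_SOFT_IRQ      = 1 << 1;
var FLAG_IS_BRANCH        = 1 << 2;
var FLAG_IS_CALL          = 1 << 3;
var FLAG_IS_RETURN        = 1 << 4;
var FLAG_IS_MEM_ACCESS    = 1 << 5;
var FLAG_IS_FIXED_ADDRESS = 1 << 6;
var FLAG_IS_BP            = 1 << 7;
var FLAG_IS_SEMIHOSTING   = 1 << 8;
var FLAG_IS_NOP           = 1 << 9;
var FLAG_IS_CONDITIONAL   = 1 << 10;
var FLAG_CONDITION_SHIFT  = 11;      // 4-bit Condition field, bits 14:11

// Flags for a conditional branch with a fixed (PC-relative) target.
function makeCondBranchFlags() {
  return FLAG_IS_BRANCH | FLAG_IS_CONDITIONAL | FLAG_IS_FIXED_ADDRESS;
}
```

Under this assumption, a conditional PC-relative branch is described by the value 0x444.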

This concludes the plugin example. We have seen that from a top-level perspective, a disassembly plugin consists of 3 predefined functions.
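The decoding arithmetic can be checked outside of Ozone against the sample input given in the getInstInfo header (InstAddr 0x20000192, bytes 63 2A 36 06). The following sketch assumes that P.BEQIMM uses the standard RISC-V B-type immediate layout with a signed offset; the function name decodePBeqImm and the member names of the returned object are hypothetical.

```javascript
// Decodes the fields of a P.BEQIMM instruction (sketch).
function decodePBeqImm(InstAddr, InstData) {
  var Encoding = ((InstData[3] << 24) | (InstData[2] << 16) |
                  (InstData[1] << 8)  |  InstData[0]) >>> 0;

  var Bit11    = (Encoding >>> 7)  & 0x1;   // Encoding[7]
  var Bits4_1  = (Encoding >>> 8)  & 0xF;   // Encoding[11:8]
  var Bits10_5 = (Encoding >>> 25) & 0x3F;  // Encoding[30:25]
  var Bit12    = (Encoding >>> 31) & 0x1;   // Encoding[31], sign bit

  var Offset = (Bits4_1 << 1) | (Bits10_5 << 5) | (Bit11 << 11) | (Bit12 << 12);
  if (Bit12) {
    Offset -= 0x2000; // sign-extend the 13-bit byte offset
  }
  return {
    Rs1:    (Encoding >>> 15) & 0x1F,
    Imm5:   (Encoding >>> 20) & 0x1F,
    Offset: Offset,
    Target: InstAddr + Offset
  };
}
```

For the sample input, this gives Rs1 = 12 (a2), Imm5 = 3 and Offset = 116, which matches the disassembly "P.BEQIMM a2, 3, 116" and yields the branch target 0x20000206.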

API Functions

This section summarizes Ozone JavaScript API functions which are relevant for the programming of disassembly plugins.

Debug.getSymbol

This API function returns the name of the symbol at or preceding the input address. Ozone only considers symbols of variable, constant, function and assembly label type for the return value. A typical use case for this API function is to obtain the label of a branch instruction.

Debug.getSymbol(U64 Addr)

Return Value

  • symbol name on success
  • undefined when no symbol could be found

Debug.enableOverrideInst

This API function allows plugin developers to override Ozone's built-in disassembler.

The function must be executed from script function init.

Debug.enableOverrideInst(InstLen, Encoding, Mask)

Parameter | Description | Type
InstLen | Instruction length | U32
Encoding | Instruction bytes | byte array
Mask | Instruction bits significant for matching. This argument must have the same byte size as argument Encoding. The argument effectively allows users to override multiple instructions at once. This is commonly desirable when overriding all instructions of a particular type. | byte array

Return Value

  • 0 on success
  • -1 on error
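The effect of the Mask argument can be illustrated with a small matching function. This is a sketch under the assumption that an instruction matches when it equals Encoding in every bit that is set in Mask; how Ozone performs the comparison internally is not stated in the source. The function name matchesOverride is hypothetical.

```javascript
// Returns true when InstData equals Encoding in all bits selected by Mask.
function matchesOverride(InstData, Encoding, Mask) {
  for (var i = 0; i < Encoding.length; i++) {
    if ((InstData[i] & Mask[i]) != (Encoding[i] & Mask[i])) {
      return false;
    }
  }
  return true;
}

// Example: ignore the immediate bits (bit 12 and bits 6:2) of the
// compressed encoding 0x1141, so that any "ADDI sp, sp, <imm>" of this
// form matches, not only the variant with immediate -16.
var Encoding = [0x41, 0x11];
var Mask     = [0x83, 0xEF];
```

With these arrays, a call such as Debug.enableOverrideInst(2, Encoding, Mask) would be expected to cover the whole instruction family rather than a single encoding.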

TargetInterface.peekBytes

Returns target memory data. An example use case for this function is to retrieve the word at the load/store location of a custom instruction.

TargetInterface.peekBytes(Addr, Size)

Parameter | Description | Type
Addr | Memory address | U64
Size | Byte size | U32

Return Value

  • memory data (byte array) on success
  • undefined on error
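A sketch of the use case mentioned above: reading the 32-bit word at a load/store address. The helper readWordLE is hypothetical. It takes the peek function as a parameter so that the logic can be tried without a target, and it assumes a little-endian target and that the returned byte array can be indexed like a JavaScript array.

```javascript
// Reads a little-endian 32-bit word via the supplied peek function.
// Returns undefined when the memory could not be read.
function readWordLE(peek, Addr) {
  var aBytes = peek(Addr, 4);
  if (aBytes === undefined || aBytes.length < 4) {
    return undefined;
  }
  return ((aBytes[3] << 24) | (aBytes[2] << 16) |
          (aBytes[1] << 8)  |  aBytes[0]) >>> 0; // >>> 0 keeps the value unsigned
}
```

Inside a plugin, the peek argument would be a thin wrapper such as function (Addr, Size) { return TargetInterface.peekBytes(Addr, Size); }.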

The Flags Parameter

The 32 bit unsigned Flags function parameter of script functions printInstAsm and getInstInfo provides additional instruction information required for disassembly and analysis. The interpretation of this parameter depends on the target architecture, as explained below.

Flags on ARM

Value | Description
0 | Address is contained within a (code-inline) data segment
1 | Address is contained within an AArch32 thumb code segment
2 | Address is contained within an AArch32 ARM code segment
3 | Address is contained within an AArch64 code segment
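A plugin for an ARM target can use the Flags value to select the matching decoder, or to skip inline data. The following sketch maps the values of the table above to short names; the function name and the returned strings are hypothetical.

```javascript
// Maps the ARM Flags value to a short segment kind name (sketch).
function getArmSegmentKind(Flags) {
  switch (Flags) {
  case 0:  return "data";     // code-inline data segment
  case 1:  return "thumb";    // AArch32 thumb code
  case 2:  return "arm";      // AArch32 ARM code
  case 3:  return "aarch64";  // AArch64 code
  default: return undefined;  // value not listed in the table
  }
}
```

A printInstAsm implementation could, for example, return undefined right away when the kind is "data".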

Flags on RISC-V

The Flags argument currently has no meaning on RISC-V.

Embedded Studio Compatibility

Ozone JavaScript plugins for disassembly support and RTOS awareness share a common JavaScript API. This API is described in the Ozone user guide, section JavaScript Classes, and is fully compatible with Embedded Studio. This means that a JavaScript plugin can be written once and then used with both software products.

References

[1] Ozone User Guide
[2] RI5CY User Manual