Linux on the Octeon

2012-03-12  ebookman

1. Introduction

Since the Octeon cnMIPS cores are fully MIPS64r2 compliant processor cores, each cnMIPS core is capable of executing any of a variety of standard operating systems. This document provides details of executing Linux on one or more cnMIPS cores. Although this document is Linux specific, the general principles covered should apply to any operating system ported to Octeon.

This document provides detailed information on the following aspects of Linux on the Cavium Networks Octeon processor:

2. The Kernel

The Linux kernel, as configured for Octeon, supports up to 16-way SMP with 32-bit and 64-bit userspace support. To give application developers the most flexibility, the base kernel uses only a bare minimum of Octeon hardware units. Optional Linux kernel drivers are available for many Octeon hardware units. All units not controlled by the base kernel or optional kernel drivers are available for application use. The following is a list of hardware resources reserved for Linux:

  1. Console Uart
    • Linux uses the uart number passed from the bootloader to determine which serial port is used for the console. Which uarts, and how many, are assigned Linux devices depends on the kernel configuration.
    • If the bootloader passes 0, the kernel maps Octeon's uart0 as /dev/ttyS0. Uart1 is not mapped to a serial device.
    • If the bootloader passes 1, the kernel maps Octeon's uart1 as /dev/ttyS0 and doesn't assign a ttyS to uart0.
    • If CONFIG_CAVIUM_OCTEON_2ND_KERNEL is set, the kernel maps Octeon's uart1 as /dev/ttyS0 and doesn't assign a ttyS to uart0.
    • If KGDB or CAVIUM_GDB is enabled, uart1 must not be assigned a ttyS. Instead it is reserved for debugger use.
    • If the Octeon chip supports uart2, then it is assigned a ttyS.
    • Note that ttyS numbers change based on which Octeon uarts are assigned ttyS devices.
    • Each uart can be individually controlled by modifying linux/arch/mips/cavium-octeon/serial.c. The enable_uart* variables control whether each uart is given a ttyS device.

  2. CIU and Interrupts 2 and 3
    • The Linux kernel automatically retrieves the information about interrupt lines 2 and 3 from the CIU. These interrupts are passed on to the kernel using the symbolic names in the arch/mips/include/asm/mach-cavium-octeon/irq.h file. These names map to the bits in the CIU interrupt summary register.

  3. Mailboxes
    • The CIU mailbox registers are used for inter-core SMP scheduling messages. Cores not under Linux control are free to use mailboxes, but care should be taken not to modify the mailbox registers of cores in use by the Linux kernel.

  4. TLB
    • All cores running Linux assume the TLB is under the complete control of the kernel. Applications may modify the TLB of cores not running Linux, but never of those actively running the kernel.

  5. Interrupts and Exceptions Locations
    • The Cavium Networks Octeon Bootloader provides a unique exception vector base address to each application image. Applications must read and use this MIPS exception base address to ensure that core vectors don't overlap.

Building the Linux Kernel

The Cavium SDK provides a high level Makefile to simplify building the Linux Kernel for different targets.

Invoking make in the linux directory without a target prints a help message.

Supply the build target:
    kernel               - Build the Linux kernel supporting all Cavium Octeon reference boards
    kernel-deb           - Linux kernel without the rootfs
    sim                  - Octeon simulation environment
    setup-octeon2        - Enable config options for running on OcteonII hardware
    setup-octeon2-sim    - Enable config options for running on OcteonII simulation
    flash                - Copy kernel onto compact flash at mount /mnt/cf1
    strip                - Strip symbols out of the kernel image
    tftp                 - Copy a stripped kernel to /tftpboot
    test                 - Test an existing simulator build
    clean                - Remove all generated files and the KERNEL CONFIG
 
    link-kernel-se-files - Create links of all necessary files from simple
                           executive directory, required for building kernel

To build a kernel that runs on OcteonII models, first invoke 'make setup-octeon2' (hardware) or 'make setup-octeon2-sim' (simulation) to enable the proper config options.

Invoke 'make kernel|sim' to build the Linux kernel image.
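
The two steps above can be chained in a small wrapper. The following is a minimal sketch and is not part of the SDK: the function name build_octeon2_kernel, its hw/sim argument, and the MAKE override are assumptions made for illustration. Only the make targets themselves come from the help text above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the SDK): enable the OcteonII config
# options, then build. Run it from the SDK's linux directory.
# $1 selects the flavour: "hw" (default) for real hardware, "sim" for the simulator.
build_octeon2_kernel() {
    bk_make=${MAKE:-make}          # override MAKE for a dry run
    case "${1:-hw}" in
        hw)  bk_setup=setup-octeon2;     bk_target=kernel ;;
        sim) bk_setup=setup-octeon2-sim; bk_target=sim ;;
        *)   echo "usage: build_octeon2_kernel [hw|sim]" >&2; return 2 ;;
    esac
    # Stop if the setup step fails so the build never runs on a stale config.
    "$bk_make" "$bk_setup" && "$bk_make" "$bk_target"
}
```

Calling build_octeon2_kernel sim would run 'make setup-octeon2-sim' followed by 'make sim'. Pairing each setup target with the matching build target is a reading of the help text, not something the SDK enforces.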

3. Embedded User Environment (Root Filesystem)

Embedded Linux systems require a small root filesystem that can be put inside of the kernel or a flash device. A standard Linux distribution, Debian for example, takes hundreds of megabytes for a basic system. This can't be used with devices that only have 8-16MB of boot flash. Cavium Networks supplies embedded_rootfs to fill this need. This filesystem build system has the following features to support small systems:

  • Full cross compile of packages.
    • A standard x86 PC is used to build the filesystem. This can then be downloaded to the embedded target.
  • Small size.
    • Each package provided by Cavium Networks has been tailored to minimize the space required for the package. Executables and shared libraries are stripped of debugging symbols. Busybox replaces most command line utilities with smaller equivalents.
  • Pluggable packages.
    • All packages can be added and removed using a GUI menuing system. Dependencies between packages are respected. If you install a package requiring 64bit libraries, the 64bit libraries are automatically selected. No existing files need to be modified to add packages.
  • Simplified init scripts.
    • A desktop Linux system using System V init uses a large number of programs that are unneeded in an embedded device. This complicated system has been replaced with a single, simple rc script.
  • Multiple filesystem formats.
    • The build system supports creating an initramfs, cramfs, squashfs, ext3, ext2, and an NFS based root filesystem.
  • OCTEON II libraries.
    • OCTEON II has additional instructions and requires separate libraries. The GUI menu system can be used to deselect installation of these libraries for a specific ABI.

Building the Filesystem

The top level Linux makefile automatically builds the embedded root filesystem when you invoke the kernel and sim targets. If you would like to build the filesystem independently from the kernel, execute make in the linux/embedded_rootfs directory. Make without a target provides a help message.

    $ make

    menuconfig        - Configure the packages for the filesystem (GUI)
    config            - Configure the packages for the filesystem (Command line)
    oldconfig         - Check the existing config file
    all               - Build the filesystem
    squashfs          - Package the filesystem into Squashfs
    cramfs            - Package the filesystem into Cramfs
    ext3              - Package the filesystem into Ext3
    ext2              - Package the filesystem into Ext2
    initramfs         - Use the CPIO archive as an initramfs
    clean             - Delete the builds and all filesystem files
    clean-root        - Delete the filesystem files leaving the builds
    distclean         - Delete all generated files, including the config

    Currently configured packages:
    kernel-modules device-files busybox init-scripts module-init-tools libpcap
    octeon-libraries-n32 octeon2-libraries-n32 octeon-libraries-64
    octeon2-libraries-64 readline openssl zlib popt lzo bridge-utils
    ethtool mii-tool net-tools tcpdump iproute2 iputils strace schedtool
    oprofile bootoct oct-linux-identify oct-linux-mdio
    oct-linux-jtg load-llm octeon-remote-utils htlbremap toolchain-utils gdb
    mtd-tools sdk-examples intercept-example iozone rsync lockstat pciutils
    libhugetlbfs testsuite final-cleanup

For reference, the top level Linux kernel target performs a "make -s all initramfs". This creates a kernel with the filesystem inside the ELF file with the kernel. Having the filesystem inside the kernel works well with TFTP and compact flash in a development style environment. The top level Linux sim target performs "make -s all ext2". The simulator uses an ext2 filesystem directly loaded into simulator memory.
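
The same "make -s all <format>" pattern can be used to build the filesystem by hand in any of the listed package formats. The sketch below is illustrative only: the function name build_rootfs and the MAKE override are assumptions, and the list of accepted formats is copied from the help message above.

```shell
#!/bin/sh
# Hypothetical helper: build the root filesystem on its own and package it
# in one of the formats listed by the embedded_rootfs help message.
# $1 = path to linux/embedded_rootfs, $2 = package format target.
build_rootfs() {
    br_dir=$1
    br_format=$2
    br_make=${MAKE:-make}          # override MAKE for a dry run
    case "$br_format" in
        squashfs|cramfs|ext3|ext2|initramfs) ;;
        *) echo "unknown filesystem format: $br_format" >&2; return 2 ;;
    esac
    # Same shape as the top level targets: "make -s all <format>".
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$br_dir" && "$br_make" -s all "$br_format" )
}
```

For example, build_rootfs "$OCTEON_ROOT/linux/embedded_rootfs" ext2 would mirror what the top level sim target does.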

Forcing a Filesystem Build

When a build of the root filesystem completes, it creates a file .root_complete. This is used as an optimization to prevent kernel builds from building the filesystem repeatedly. The build system attempts to check for makefile or configuration changes, but sometimes it misses changes that have occurred in packages. If this happens, simply delete the .root_complete file to force the filesystem to rebuild on the next make.
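
As a minimal sketch of that step, the helper below removes the marker file. The function name force_rootfs_rebuild is hypothetical, and because the text does not say which directory holds .root_complete, the directory is left as a required argument rather than guessed.

```shell
#!/bin/sh
# Hypothetical helper: drop the completion marker so the next make rebuilds
# the root filesystem. $1 is the directory holding .root_complete; the
# document does not name that directory, so the caller must supply it.
force_rootfs_rebuild() {
    [ -n "$1" ] || { echo "usage: force_rootfs_rebuild DIR" >&2; return 2; }
    fr_marker="$1/.root_complete"
    if [ -e "$fr_marker" ]; then
        rm -f "$fr_marker" && echo "removed $fr_marker"
    else
        echo "no marker at $fr_marker"
    fi
}
```

After the marker is removed, the next 'make kernel' or 'make sim' rebuilds the filesystem as described above.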

Tips to shrink size

This section has moved to Linux on Small Octeon Systems - 8. Tips to shrink the root filesystem size.

Installing in Flash

This section has moved to Linux on Small Octeon Systems - 3. Setting up the boot flash.

How to Add a Package

The embedded rootfs make system is designed to allow packages to be added easily. By adding two files to the package directories, the new package will appear in the configuration menu and be built as part of the filesystem. Under pkg_kconfig, you need to add a menu entry for the package. You then add the package's makefile under pkg_makefiles. As an example, the files below are for strace.

pkg_kconfig/70-strace.kconfig

    config CONFIG_strace
    	bool "strace"
    	default y
    	help
    		From the strace homepage:
    		Strace is a system call trace, i.e. a debugging tool which
    		prints out a trace of all the system calls made by another
    		process/program. The program to be traced need not be recompiled
    		for this, so you can use it on binaries for which you don't have
    		source.

    		System calls and signals are events that happen at the
    		user/kernel interface. A close examination of this boundary is
    		very useful for bug isolation, sanity checking and attempting
    		to capture race conditions

Note:
The sort order of the files in pkg_kconfig controls the order in which they appear in the menu. For simplicity, a numeric prefix is used to control ordering.

pkg_makefiles/strace.mk

    PKG:=strace
    VERSION:=4.5.14
    DIR:=${PKG}-${VERSION}

    .PHONY: all
    all: build install

    .PHONY: build
    build: ${DIR} ${DIR}/Makefile
    ifdef TOOLCHAIN_UCLIBC
    	sed -i "s/#define HAVE_STROPTS_H 1/\/\/#define HAVE_STROPTS_H/g" ${DIR}/config.h
    endif
    	${MAKE} -C ${DIR}

    ${DIR}/Makefile:
    	cd ${DIR} && ./configure --host=${CROSS} CFLAGS="${CFLAGS}" LDFLAGS="${TOOLCHAIN_ABI}"

    .PHONY: install
    install: ${DIR}
    	mkdir -p ${ROOT}/usr/bin
    	${STRIP} -o ${ROOT}/usr/bin/strace ${DIR}/strace

    ${DIR}:
    	tar -jxf ${STORAGE}/${PKG}-${VERSION}.tar.bz2
    	cd ${DIR} && patch -p0 < ${STORAGE}/strace.patch

Notes on makefiles for packages

  1. The makefile is run in linux/embedded_rootfs/build.
  2. Libraries need to be placed in /usr/lib32 or /usr/lib64. The directory /usr/lib is not searched since it is for MIPS O32 libraries.
  3. Use strip on all executables and libraries to save space.
  4. The following variables are exported to the makefile:
    • ROOT = The directory where the filesystem is being created.
    • STORAGE = linux/embedded_rootfs/storage
    • SOURCE_DIR = linux/embedded_rootfs/source
    • ETC_FILES = linux/embedded_rootfs/etc-files
    • KERNEL_DIR = The location of the kernel source
    • CROSS = mips64-octeon-linux-gnu
    • CC = mips64-octeon-linux-gnu-gcc
    • CXX = mips64-octeon-linux-gnu-g++
    • LD = mips64-octeon-linux-gnu-ld
    • AR = mips64-octeon-linux-gnu-ar
    • RANLIB = mips64-octeon-linux-gnu-ranlib
    • STRIP = mips64-octeon-linux-gnu-strip
    • CFLAGS = Recommended C flags
    • CXXFLAGS = Recommended C++ flags
    • TOOLCHAIN_ABI = The current build ABI. Can be either -mabi=n32 or -mabi=64.
    • TOOLCHAIN_UCLIBC = Use -muclibc ABI
    • LDFLAGS = -melf32btsmipn32 or -melf64btsmip
    • OCTEON_EXTRA_CFLAGS = The current build instruction set. Can be either -march=octeon or -march=octeon2.
    • LIBDIR = The directory to install libraries
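
To make the two-file layout concrete, the following sketch scaffolds the files for a new package. It is not an SDK tool: the function name new_rootfs_package, its arguments, and the stub contents are all assumptions modeled on the strace example above. The generated makefile contains only empty targets, so the real build and install rules still have to be written by hand.

```shell
#!/bin/sh
# Hypothetical scaffold: create the two files a new package needs.
# $1 = embedded_rootfs directory, $2 = package name, $3 = numeric menu prefix.
new_rootfs_package() {
    np_root=$1; np_pkg=$2; np_prefix=$3
    [ -n "$np_root" ] && [ -n "$np_pkg" ] && [ -n "$np_prefix" ] || {
        echo "usage: new_rootfs_package DIR NAME PREFIX" >&2; return 2; }
    mkdir -p "$np_root/pkg_kconfig" "$np_root/pkg_makefiles" || return 1
    # Menu entry, following the layout of the strace example (tab indented).
    printf 'config CONFIG_%s\n\tbool "%s"\n\tdefault n\n\thelp\n\t\tDescribe %s here.\n' \
        "$np_pkg" "$np_pkg" "$np_pkg" \
        > "$np_root/pkg_kconfig/$np_prefix-$np_pkg.kconfig" || return 1
    # Makefile stub with the same phony targets as strace.mk and no recipes yet.
    printf 'PKG:=%s\n\n.PHONY: all\nall: build install\n\n.PHONY: build\nbuild:\n\n.PHONY: install\ninstall:\n' \
        "$np_pkg" \
        > "$np_root/pkg_makefiles/$np_pkg.mk"
}
```

For example, new_rootfs_package linux/embedded_rootfs mytool 75 would create pkg_kconfig/75-mytool.kconfig and pkg_makefiles/mytool.mk, where mytool and 75 are placeholder values.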

Userspace init Sequence

To simplify embedded Linux development, the userspace initialization scripts have been reduced to a single shell script. The userspace initialization process is as follows:

  1. Linux kernel mounts the root filesystem.
  2. The kernel starts the first user process /sbin/init provided by Busybox.
  3. /sbin/init reads /etc/inittab (From the SDK: linux/embedded_rootfs/etc-files/inittab).
  4. /etc/inittab starts the shell script /sbin/rc (From the SDK: linux/embedded_rootfs/etc-files/rc).
  5. /sbin/rc mounts the kernel pseudo filesystems /proc, /dev/shm, and /dev/pts.
  6. /sbin/rc brings up the loopback network device with the IP address 127.0.0.1.
  7. /sbin/rc starts syslogd provided by Busybox.
  8. /sbin/rc starts telnetd provided by Busybox.
  9. /sbin/rc exits, returning control to /sbin/init.
  10. /etc/inittab tells /sbin/init to spawn an interactive shell.
  11. The user interactive shell prompt appears.
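
Steps 5 through 9 can be pictured as the short script below. This is an illustration only and is not the contents of the SDK's etc-files/rc: the function name rc_sketch, the RUN dry-run variable, and the filesystem types passed to mount are assumptions. The commands themselves (mount, ifconfig, syslogd, telnetd) are the usual Busybox applets.

```shell
#!/bin/sh
# Illustrative only: NOT the SDK's etc-files/rc, just the listed steps
# written out as commands. Set RUN=echo to print the commands instead of
# executing them (executing requires root on the target).
rc_sketch() {
    ${RUN:-} mount -t proc proc /proc          # step 5: pseudo filesystems
    ${RUN:-} mount -t tmpfs tmpfs /dev/shm     # filesystem type is assumed
    ${RUN:-} mount -t devpts devpts /dev/pts
    ${RUN:-} ifconfig lo 127.0.0.1 up          # step 6: loopback device
    ${RUN:-} syslogd                           # step 7: Busybox syslogd
    ${RUN:-} telnetd                           # step 8: Busybox telnetd
}                                              # step 9: control returns to the caller
```

Running RUN=echo rc_sketch on a development host prints the six commands without touching the system.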

4. Loading The Filesystem

Octeon Simulator

Since the Octeon simulation environment does not provide a virtual disk device, the root filesystem is loaded from a fixed address (0x40000000) in simulated RAM. The Linux kernel configures an MTD block device (/dev/mtdblock0) and mounts the filesystem from memory. The default configuration limits the in-memory filesystem to 1GB.

EBT58XX Hardware Reference Board

The filesystem is built directly into the kernel as an initramfs image. After the kernel boots, the initial contents of the filesystem are extracted into a memory based filesystem (tmpfs). Once the extraction is complete, /init is called.

Contents of the Filesystem

The default filesystem is built from scratch under linux/embedded_rootfs. Performing a make ext2 in this directory will build the filesystem.

Note:
A couple of the commands needed to build the root filesystem require root privileges. The Makefile uses the sudo command to perform these steps. Ensure that sudo is configured properly for your user. For example, sudo ls should work.

The Makefile copies each simple executive example that supports Linux into /examples in the filesystem. See the documentation for each example for directions on how to run it.

5. Running Linux on the Simulator

The SDK provides the shell script oct-linux to simplify the execution of Linux on the simulator. It is simply a convenience wrapper around oct-sim. In most cases you will start Linux with the following command line:

    $ oct-linux -quiet -noperf -numcores=#

The options "-quiet" and "-noperf" are not strictly needed, but greatly increase the simulation speed. The number of cores running Linux can be controlled by the "-numcores" argument. Here is a listing of the oct-linux script and a description of each of its parts.

    #!/bin/bash

    memory=384
    uart=2020
    packet_port=2000

    oct-sim linux/vmlinux.64 -envfile=u-boot-env -memsize=${memory} \
        -uart0=${uart} -serve=${packet_port} \
        -ld0x40000000:embedded_rootfs/rootfs.ext2 $*

  • oct-sim
    • The SDK script for executing the Octeon simulator
  • linux/vmlinux.64
    • The Linux kernel packaged as a 64 bit elf binary.
  • -envfile=u-boot-env
    • Octeon Bootloader environment to automatically start Linux. It contains the single line: bootcmd=bootoctlinux 0x10000000
  • -memsize=${memory}
    • Set the amount of simulated memory
  • -uart0=${uart}
    • The TCP/IP port the simulator will listen on for uart connections
  • -serve=${packet_port}
    • The TCP/IP port the simulator will listen on for oct-packet-io connections.
  • -ld0x40000000:embedded_rootfs/rootfs.ext2
    • Load the filesystem binary into simulated ram at the expected MTD address.
  • $*
    • Any arguments supplied by the user.

Once Linux is running in the simulator you will start seeing the following messages from the simulator:

    waiting for a connection to uart 0 1
    waiting for a connection to uart 0 2
    waiting for a connection to uart 0 3
    waiting for a connection to uart 0 4
    

At this time you need to connect to the simulator using TCP to get the uart data. The standard program telnet works well for this. In another terminal issue the following command:

    $ telnet localhost 2020

You should now see the Linux boot messages followed by userspace initialization. After all this is complete, an interactive shell will appear.

Note:
Once Linux boots to a shell prompt, it can be useful to change telnet to character mode instead of line mode. In the telnet session press Control-] and enter mode char at the prompt. Then hit enter a few times. Shell tab completion, Control-C, and other interactive aspects should now work.

6. Running Linux on the EBT58XX Hardware

Build Linux for the EBT58XX.

    $ cd $OCTEON_ROOT/linux
    $ make -s clean
    $ make -s kernel
    

Copy the Linux kernel to a compact flash.

    $ mkdir -p /mnt/usb
    $ fdisk -l /dev/sda         # Only needed on some Kernel 2.6 systems
    $ mount /dev/sda1 /mnt/usb
    $ mips64-octeon-linux-gnu-strip -o /mnt/usb/vmlinux.64 kernel_2.6/linux/vmlinux.64
    $ umount /mnt/usb
    

Put the compact flash into the EBT58XX and reset the board. At the bootloader prompt load linux.

    Octeon ebt5800# fatload ide 0 $(loadaddr) vmlinux.64
    Octeon ebt5800# bootoctlinux $(fileaddr)
    

Note:
Most bootloaders have an alias for this: run linux_cf

All arguments on the bootoctlinux command are passed to the Linux kernel. Two options are used frequently: mem= and root=.

  • mem= sets how much memory the Linux kernel uses, in megabytes. Setting it to 0 uses all available memory.
  • root= sets the device for the root filesystem. With Debian, this is used to point the root filesystem at the compact flash (root=/dev/sda2).

7. Process Context and Coprocessor 2

Linux on the Octeon saves and restores coprocessor 2 (COP2) so that it may be used freely by userspace applications.

In order to improve performance, Linux disables COP2 by default. COP2 is only enabled when a task attempts to perform a COP2 operation. At this time, the kernel enables COP2 and restores any saved state. Here is a description of kernel COP2 processing:

  1. COP2 is disabled using COP0 Status[CU2].
  2. Application runs and executes a COP2 instruction.
  3. Kernel receives an illegal COP2 access exception.
  4. Kernel enables COP2 and initializes its state.
  5. Kernel returns control to the application.
  6. Application continues processing, performing any number of COP2 instructions.
  7. Application stops processing and returns to kernel context. This can occur through an interrupt, exception, or syscall.
  8. Kernel performs a context switch by calling resume.
  9. The assembly function resume checks if COP0 Status[CU2] has enabled COP2. If so, it calls octeon_cop2_save to save COP2 state.
  10. COP2 is again disabled using COP0 Status[CU2].
  11. Normal kernel processing continues.
  12. Control returns to the application. Note that COP2 is still disabled.
  13. Application continues processing until reaching a COP2 instruction.
  14. Kernel receives an illegal COP2 access exception.
  15. Kernel enables COP2 and restores its state.
  16. Kernel returns control to the application.

Kernel use of COP2 is normally disabled since it can corrupt the userspace state. In order to access COP2 from inside the kernel, you must wrap your COP2 code with calls to the functions octeon_crypto_enable() and octeon_crypto_disable(). The following is a simplified example:

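    #include <asm/octeon/octeon.h>

    void do_kernel_crypto(void)
    {
        /* Keep the saved state on the stack so the function stays reentrant. */
        struct octeon_cop2_state state;
        unsigned long flags;

        /* Enable COP2, saving any live userspace COP2 state if necessary. */
        flags = octeon_crypto_enable(&state);

        /* COP2 accesses go here. */

        /* Pass flags back unchanged; restores the state if one was saved. */
        octeon_crypto_disable(&state, flags);
    }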

The key issues to remember with kernel level COP2 access are:

  1. The structure octeon_cop2_state normally must be stored on the stack. Most functions in the kernel are reentrant, as they can be called on behalf of a process, in softirq, and from an interrupt.
  2. The "flags" parameter must be passed unchanged to octeon_crypto_disable(). This "flags" is independent of local_irq_save() and local_irq_restore() which also tend to use a "flags" parameter. If you need interrupts disabled you must call these independent of the COP2 enable / disable.
  3. It is possible for COP2 to be enabled outside of regions surrounded by these functions. It is unsafe to use COP2 in this instance. All kernel COP2 must be surrounded by these calls.
  4. COP2 will only be saved if necessary. Sometimes calls to these functions will not update the state parameter.

8. Process Context and CVMSEG (Local Scratch Memory)

Userspace state in CVMSEG is saved and restored on context switch. This allows any application to use the CVMSEG memory configured with CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE. Here is a description of kernel CVMSEG processing:

  1. On boot CVMSEG size is set using CvmMemCtl[LMEMSZ] = CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE.
  2. Kernel access to CVMSEG is enabled using CvmMemCtl[CVMSEGENAK] = 1.
  3. User access to CVMSEG is disabled using CvmMemCtl[CVMSEGENAU] = 0.
  4. Application runs and executes an access to CVMSEG. This could be a load, store, or an asynchronous IOBDMA.
  5. Kernel receives an invalid memory reference exception.
  6. In do_ade(), the kernel determines the address is in CVMSEG.
  7. The kernel enables user access to CVMSEG using CvmMemCtl[CVMSEGENAU] = 1.
  8. Kernel returns control to the application.
  9. Application continues processing, performing any number of CVMSEG accesses.
  10. Application stops processing and returns to kernel context. This can occur through an interrupt, exception, or syscall.
  11. Kernel performs a context switch by calling resume.
  12. The assembly function resume checks if COP0 CvmMemCtl[CVMSEGENAU] has enabled CVMSEG. If so, it saves CVMSEG state.
  13. CVMSEG is again disabled using COP0 CvmMemCtl[CVMSEGENAU].
  14. Normal kernel processing continues.
  15. Control returns to the application. Note that CVMSEG is still disabled.
  16. Application continues processing until reaching a CVMSEG access.
  17. Kernel receives an invalid memory reference exception.
  18. In do_ade(), the kernel determines the address is in CVMSEG.
  19. The kernel restores the CVMSEG context and enables user access to CVMSEG using CvmMemCtl[CVMSEGENAU] = 1.
  20. Kernel returns control to the application.

Kernel use of CVMSEG must always save and restore any changes that it makes. It must issue a SYNCIOBDMA to make sure all asynchronous operations are complete before the save and before the restore.

9. Process Context and the Multiply Unit

The extended multiply unit context is saved in SAVE_SOME and restored in RESTORE_SOME (stackframe.h). This occurs every time the processor switches from user to kernel context due to an interrupt, exception, or syscall. The multiply unit may be freely used from user and kernel space.

10. Accelerated Thread Local Storage (TLS) Access

On the MIPS architecture, Linux implements thread local storage (TLS) using hardware register 29. GCC and Glibc use the instruction rdhwr v1, $29 to get the current value of the thread pointer. Since hardware register 29 doesn't exist on most MIPS processors (including OCTEON and OCTEON Plus), the kernel traps this instruction with a Reserved Instruction Exception. The exception handler emulates the rdhwr instruction and places the current thread pointer in "v1". The overhead of emulating this nonexistent hardware register is very high. Because the OCTEON II processor does implement hardware register 29, it executes rdhwr v1, $29 with no added overhead.

On Octeon, the use of k0 and CVMSEG provides much faster access to the thread pointer. Using a combination of the two, Octeon Linux accesses the thread pointer using a single dual-issuable instruction. Applications that make heavy use of TLS and threads will receive a major performance boost. The normal MIPS instruction emulation, taking many hundreds of cycles, is replaced with a single cycle local access. Support for the improved access is part of the Cavium supplied toolchain.

If you are using a non-Cavium toolchain, or a toolchain prior to SDK 1.5, the kernel supports dynamically replacing instructions in userspace with the faster access method. Instruction replacement is disabled on boot. It can be controlled by writing a mode to /sys/module/traps/parameters/thread_pointer_mode. The supported modes are:

  • 0 - Use the normal kernel emulation without any changes.
  • 1 - Replace emulated instructions with direct accesses to the thread register.
  • 2 - Replace emulated instructions and log the replacement PC.
  • 3 - Replace emulated instructions with break instructions. This will cause programs to fail, but makes it easy to stop gdb on the instruction.
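
Selecting a mode amounts to writing one digit to the parameter file. The helper below is a minimal sketch: the function name set_thread_pointer_mode and its optional second argument (an alternate file, useful for trying the function off-target) are assumptions. The default path and the mode values are the ones given above.

```shell
#!/bin/sh
# Select the TLS instruction replacement mode (0-3, as listed above).
# $1 = mode; $2 = parameter file, defaulting to the sysfs path named above.
# Writing the real file requires root on a running Octeon kernel.
set_thread_pointer_mode() {
    tp_mode=$1
    tp_file=${2:-/sys/module/traps/parameters/thread_pointer_mode}
    case "$tp_mode" in
        0|1|2|3) ;;
        *) echo "mode must be 0, 1, 2, or 3" >&2; return 2 ;;
    esac
    echo "$tp_mode" > "$tp_file"
}
```

For example, set_thread_pointer_mode 2 enables replacement and logs each replaced PC, which helps identify binaries that still use the emulated instruction.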

In implementing fast TLS access, Octeon uses the k0 register in userspace and the highest allocated CVMSEG address. It is normally not valid to use the k0 register in userspace, so this will not affect userspace programs. The CVMSEG usage will cause problems for applications that use the same address as the kernel. In the case of a conflict, TLS will continue to function properly, but any data the application has placed in the last CVMSEG address will be corrupted.

Starting with SDK 2.0, if the Cavium toolchain is generating code for OCTEON II processors (by passing -march=octeon2 to the compiler), the standard TLS access using rdhwr v1, $29 is generated. However by default, when generating code for OCTEON and OCTEON Plus processors, the accelerated TLS access using k0 is used.

11. Running the Linux kernel in mapped memory.

The default configuration of the Linux kernel places the kernel in unmapped memory, and any loadable kernel modules are in a mapped memory region. This separation of the kernel and modules requires that a less efficient function calling mechanism be used in the kernel modules than in the kernel itself, resulting in a decrease in performance for some code.

The optional CONFIG_MAPPED_KERNEL kernel configuration parameter causes the kernel to be run in the mapped memory region. It also builds the kernel modules with the same efficient function calling mechanism used in the main kernel. The result can be increased performance. However, one TLB entry on each core is used for the kernel mapping, which reduces the number of TLB entries available for normal use and might itself reduce performance. Benchmark the intended workload to determine whether CONFIG_MAPPED_KERNEL actually improves performance for the case in question.

Two separate mapped kernels.

Octeon Linux can be configured to run two separate mapped kernels, with each kernel using a separate uart for its console. The procedure for building two separate kernels is the same as described in section 23, Booting Two Separate Kernels on an EBT58XX. In addition, enable the CONFIG_MAPPED_KERNEL config option when building both kernels.

Both images are linked at the same virtual address, and the bootloader makes sure it allocates separate physical memory when loading them.

12. Per-process access control for XKPHYS

Certain applications (such as Simple Executive applications running in Linux userspace) require direct access to the XKPHYS memory and IO spaces. By default, access to XKPHYS segments is disabled for all applications. Applications that need to access these segments must call the sysmips() system call to enable access.

To enable access to XKPHYS memory space the application needs to call sysmips as follows:

    sysmips(MIPS_CAVIUM_XKPHYS_WRITE, getpid(), 1, 0);

To enable access to XKPHYS IO space:

    sysmips(MIPS_CAVIUM_XKPHYS_WRITE, getpid(), 2, 0);

To enable access to both XKPHYS memory and IO space:

    sysmips(MIPS_CAVIUM_XKPHYS_WRITE, getpid(), 3, 0);

The kernel also provides config options for enabling XKPHYS access for all processes without requiring them to call sysmips(). Using these config options is not recommended. See section 23 below for a complete list of Octeon specific config options.

13. Co-existing with Simple Executive Applications

If Linux is running on a subset of the Octeon cores, it can co-exist with other operating systems and simple executive applications. Internally Linux uses the simple executive libraries for memory management, synchronization, and hardware access. For example, all memory used by Linux is allocated using the simple executive function cvmx_bootmem_alloc(). So long as each application / operating system uses the appropriate CVMX library calls for memory management and synchronization, each core can perform a completely independent task. All cores must cooperate for all shared hardware configuration. The Octeon Bootloader documentation provides details about loading and starting multiple operating systems / applications. Here is a list of general guidelines:

  1. Allocate shared memory using cvmx_bootmem_alloc(). This function provides the needed synchronization so that no applications get overlapping memory.
  2. Keep core dependencies generic. Instead of allocating cores by core ID, use cvmx_sysinfo_get() to get the bitmask of cores actually running your application. Use cvmx_sysinfo_t::core_mask to determine how many cores are running your application, and use cvmx_coremask_first_core() to select the core for initialization tasks.
  3. Choose a single application to perform hardware initialization. Many initialization tasks must only be performed once.
  4. Use Octeon hardware for inter application communication. The POW with groups and the FAU unit provide fast hardware based messaging.

14. Linux Hot-Plug CPU.

If the kernel is built with the CONFIG_HOTPLUG_CPU kernel configuration parameter, Octeon CPU cores can be removed from the set of cores used by the Linux kernel. This also allows adding new cores to the existing set of cores.

See linux/kernel_2.6/linux/Documentation/cpu-hotplug.txt for details.

Cores removed from Linux can then be used to run Simple Executive applications (19. Booting Simple Executive applications from Linux using bootoct).
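
On a kernel built with CONFIG_HOTPLUG_CPU, cores are normally taken offline and brought back through the standard Linux sysfs file /sys/devices/system/cpu/cpuN/online. The sketch below wraps that write. The function name set_cpu_online and its optional third argument (an alternate base directory for trying the function off-target) are assumptions, not part of the SDK.

```shell
#!/bin/sh
# Take a core offline or bring it back using the standard Linux sysfs
# interface. $1 = cpu number, $2 = 0 (offline) or 1 (online),
# $3 = sysfs cpu directory (optional override, assumed for dry runs).
set_cpu_online() {
    hp_cpu=$1; hp_state=$2
    hp_base=${3:-/sys/devices/system/cpu}
    case "$hp_state" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac
    hp_file="$hp_base/cpu$hp_cpu/online"
    # The file is absent when the kernel lacks hotplug support for this cpu.
    [ -w "$hp_file" ] || { echo "cannot write $hp_file" >&2; return 1; }
    echo "$hp_state" > "$hp_file"
}
```

For example, set_cpu_online 3 0 removes core 3 from the set used by Linux, and set_cpu_online 3 1 adds it back.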

15. Kernel Ethernet Drivers

An ethernet driver module is available in the kernel to support Octeon using the SGMII interfaces, SPI4 with a SPI4000 daughter card, or XAUI interfaces. SGMII ports show up as ethernet devices eth0 through eth4, SPI4000 ports as devices spi0 through spi9, and XAUI ports as ethernet device xaui0. The available interfaces vary based on which Octeon model is being used.

In order to configure different modes of operation, the ethernet driver module supports a number of module parameters for controlling its configuration. These are set when you modprobe the ethernet driver.

Parameters

	$ modprobe octeon-ethernet [param=value ...]

  1. num_packet_buffers
    • Number of packet buffers to allocate and store in the FPA. By default, 1024 packet buffers are used unless CONFIG_CAVIUM_OCTEON_NUM_PACKET_BUFFERS is defined.
  2. pow_receive_group
    • POW group to receive packets from. All ethernet hardware will be configured to send incoming packets to this POW group. Also any other software can submit packets to this group for the kernel to process.
  3. pow_send_group
    • POW group to send packets to other software on. This controls the creation of the virtual device pow0. always_use_pow also depends on this value.
  4. always_use_pow
    • When set, always send to the pow group. This will cause packets sent to real ethernet devices to be sent to the POW group instead of the hardware. Unless some other application changes the config, packets will still be received from the low level hardware. Use this option to allow a CVMX app to intercept all packets from the linux kernel. You must specify pow_send_group along with this option.
  5. pow_send_list
    • Comma separated list of ethernet devices that should use the POW for transmit instead of the actual ethernet hardware. This is a per port version of always_use_pow. always_use_pow takes precedence over this list. For example, setting this to "eth2,spi3,spi7" would cause these three devices to transmit using the pow_send_group.
  6. disable_core_queueing
    • When set, the networking core's tx_queue_len is set to zero. This allows packets to be sent without lock contention in the packet scheduler, resulting in some cases in improved throughput.
  7. max_rx_cpus
    • The maximum number of CPUs to use for packet reception. Use -1 to use all available CPUs.
  8. rx_napi_weight
    • The NAPI WEIGHT parameter.
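
The parameter list above can be turned into a modprobe command line mechanically. The following is a minimal sketch, not part of the SDK: the function name modprobe_args is hypothetical, and the only rule it enforces is the one stated above, that always_use_pow must be accompanied by pow_send_group.

```python
# Parameter names listed above for the octeon-ethernet module.
KNOWN_PARAMS = (
    "num_packet_buffers", "pow_receive_group", "pow_send_group",
    "always_use_pow", "pow_send_list", "disable_core_queueing",
    "max_rx_cpus", "rx_napi_weight",
)


def modprobe_args(params):
    """Return the argv list for 'modprobe octeon-ethernet param=value ...'."""
    unknown = sorted(set(params) - set(KNOWN_PARAMS))
    if unknown:
        raise ValueError("unknown module parameter(s): " + ", ".join(unknown))
    # The text above requires pow_send_group whenever always_use_pow is set.
    if params.get("always_use_pow") and "pow_send_group" not in params:
        raise ValueError("always_use_pow requires pow_send_group")
    argv = ["modprobe", "octeon-ethernet"]
    for name in KNOWN_PARAMS:  # emit in the order the parameters are listed
        if name in params:
            value = params[name]
            if isinstance(value, bool):
                value = int(value)       # flags are passed as 0 or 1
            elif isinstance(value, (list, tuple)):
                value = ",".join(value)  # e.g. pow_send_list
            argv.append(f"{name}={value}")
    return argv


# The group number 14 is an arbitrary value chosen for illustration.
print(" ".join(modprobe_args({
    "pow_send_group": 14,
    "pow_send_list": ["eth2", "spi3", "spi7"],
})))
```

The returned list can be passed to subprocess.run() on the target, or joined and pasted into a shell.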

The ethernet module assumes the Octeon hardware needs to be initialized before use. It configures the POW, FPA, CIU, PIP, IPD, PKO, and the FAU. It uses a configuration very similar to the one supplied by cvmx_helper_initialize_fpa() and cvmx_helper_initialize_packet_io_global().

When the driver is in use, applications must not reconfigure the hardware. All packets in POW group 15 will be processed by the kernel. Applications running in user mode or in the simple executive standalone environment may use this group to forward packets to the kernel for processing.

The ethernet driver, along with oct-packet-io, can be used to connect the Octeon simulator to a real ethernet device. This allows the simulator virtual machine to appear on a network just like real hardware. Here are the steps required:

  1. In the simulated Linux, use ifconfig to set the MAC address to match the address of the ethernet card used for a bridge. If you're willing to run the ethernet card in promiscuous mode this isn't necessary.
            simulator$ ifconfig eth0 hw ether XX:XX:XX:XX:XX:XX
    
  2. Make sure the ethernet device is up. It doesn't have to have an IP address. You may also want to put it in promiscuous mode.
            host$ ifconfig ??? up
    
  3. Use oct-packet-io to bridge ethernet packets into the simulator. Specify "-o /dev/null" to disable the logging of all packets coming out of the simulator. Note that bridging requires root privileges.
            host$ sudo oct-packet-io -p 2000 -b 0:???
    
  4. In the simulator assign the ethernet a valid IP address. If you have a DHCP server, you may be able to use udhcpc to get an address. For udhcpc to work the interface must be up.
            simulator$ ifconfig eth0 192.168.1.100
    
            or
    
            simulator$ ifconfig eth0 up
            simulator$ udhcpc -n -q -i eth0
    
  5. Ping the simulator. It should now be possible to ping the simulator from another host.
            host2$ ping 192.168.1.100
    
  6. Telnet to the simulator. Assuming everything is working, you should be able to telnet to the simulator and get a shell prompt.
            host2$ telnet 192.168.1.100
    

Note:
Since the packet interface used for bridging bypasses routing, the local host cannot reach the simulator. For the local host to reach the simulator, it must have two ethernet cards; the second card can then be dedicated to the simulator.
The simulated environment is much slower than real hardware. Some protocols may not function correctly due to timeouts and slow data transfer. NFS, HTTP, telnet, and ssh have all been successfully tested.

Management port Ethernet drivers

In addition to the main Ethernet ports mentioned above, the cn52XX, cn56XX, and cn63XX OCTEON processors have additional MII or RGMII Ethernet ports. These are controlled by the octeon_mgmt Ethernet driver and will be named mgmt0 and mgmt1 (if present). The driver is selected with the CONFIG_OCTEON_MGMT_ETHERNET kernel configuration parameter. There are no user settable parameters for this driver.

16. Kernel USB Drivers

Some OCTEON processors have Universal Serial Bus (USB) ports. The cn3XXX and cn5XXX OCTEON processors with USB use the octeon-hcd driver, and cn6XXX OCTEON processors use standard ehci/ohci drivers.

cn3XXX and cn5XXX USB

The octeon-hcd driver is selected with the CONFIG_USB_OCTEON_HCD kernel configuration parameter.

cn6XXX USB

The ehci/ohci USB blocks on cn6XXX OCTEON processors use the standard Linux EHCI and OHCI drivers; however, additional OCTEON-specific interface options must also be selected for these drivers to work with OCTEON processors. To enable these drivers, first select the CONFIG_USB_EHCI_HCD and CONFIG_USB_OHCI_HCD options, then select CONFIG_USB_OCTEON_EHCI and CONFIG_USB_OCTEON_OHCI.

17. Kernel Random Number Generator (RNG) Driver

The octeon-rng driver ties the OCTEON processor's hardware random number generator into the standard Linux kernel RNG framework. This driver is selected with the CONFIG_HW_RANDOM_OCTEON kernel configuration parameter.

18. Configuring NFS Root Filesystem

Configuring the Octeon Linux kernel to use NFS as the root filesystem can be more complicated than normal since the ethernet driver is a module. To use an NFS root filesystem, you must have an initramfs that loads the ethernet module and does a pivot root. The embedded rootfs supplied with the SDK already has all the setup required to do this. To enable it, turn on NFS in the configuration system.

    $ cd linux/embedded_rootfs
    $ make menuconfig
    NFS Root filesystem  --->
    [*] Change root to a NFS filesystem
    (eth0) Network device
    [*]   Use DHCP to get the IP address, gateway, and DNS server
    (10.0.0.1:/home/debian_rootfs) NFS server path
    (nolock,rw) NFS mount options

Note:
Long delays when logging in are normally caused by the reverse lookup of your host name. Make sure the lookup will succeed, or that the system has no DNS server configured.

19. Configuring a separate initrd or initramfs Root Filesystem

The default Linux kernel built by 'make kernel' contains an embedded root filesystem. It is also possible to package this filesystem separately from the kernel image. It is then loaded into a separate memory region from the kernel.

First build the kernel, and copy it to the tftp directory.

    $ cd linux
    $ make kernel-deb
    .
    .
    .
    $ cp kernel_2.6/linux/vmlinux.64 /var/lib/tftpboot/

Then build the root filesystem image and copy it to the tftp directory. We also need to calculate the size of the memory area needed to hold the image, rounded up to a 64KB boundary.

    $ cd linux/embedded_rootfs
    $ make initramfs
    .
    .
    .
    $ ls -l rootfs.cpio.gz | gawk -- '{printf "%x\n", ($5 + 0x10000) - ($5 % 0x10000)}'
    2e0000
    $ cp rootfs.cpio.gz /var/lib/tftpboot/

Now we are ready to boot the board. This part takes some manual calculations. We need to find a free memory region in the lower 4GB of memory that can hold the initramfs. The block should be aligned on a 64KB boundary. Looking at the results of the 'freeprint' command, we see that it will fit at 0x81d0000, so we create a named block called 'my_initrd' at that location with the proper size (0x2e0000 from above). Then we load the initramfs image into this named block with the tftpboot command specifying the address of the named block. The bootloader prints a WARNING message, but we can disregard it, because we know we are loading data to our named block.

We pass the kernel command line argument 'rd_name=my_initrd' to indicate the name of the memory block to the kernel.

Octeon ebh5600# dhcp
BOOTP broadcast 1
DHCP client bound to address 10.2.0.33

Octeon ebh5600# freeprint



Printing bootmem block list, descriptor: 0x0000000000024108,  head is 0x00000000081c0940
Descriptor version: 3.0
Block address: 0x00000000081c0940, size: 0x0000000007e2d2c0, next: 0x0000000026000000
Block address: 0x0000000026000000, size: 0x00000000da000000, next: 0x0000000410000000
Block address: 0x0000000410000000, size: 0x000000000ff00000, next: 0x0000000000000000


Octeon ebh5600# namedalloc my_initrd 0x2e0000 0x81d0000
Allocated 0x00000000002e0000 bytes at address: 0x00000000081d0000, name: my_initrd

Octeon ebh5600# tftpboot 0x81d0000 rootfs.cpio.gz
Using octmgmt0 device
TFTP from server 10.1.1.1; our IP address is 10.2.0.33
Filename 'rootfs.cpio.gz'.
Load address: 0x81d0000
Loading: #####################
done
Bytes transferred = 3000755 (2dc9b3 hex), 6398 Kbytes/sec
WARNING: Data loaded outside of the reserved load area, memory corruption may occur.
WARNING: Please refer to the bootloader memory map documentation for more information.

Octeon ebh5600# tftpboot $(loadaddr) vmlinux.64
Using octmgmt0 device
TFTP from server 10.1.1.1; our IP address is 10.2.0.33
Filename 'vmlinux.64'.
Load address: 0x20000000
Loading: #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 #################################################################
	 ###############################
done
Bytes transferred = 60285919 (397e3df hex), 6930 Kbytes/sec

Octeon ebh5600# bootoctlinux $(loadaddr) numcores=$(numcores) endbootargs rd_name=my_initrd mem=1024M
argv[2]: numcores=12
argv[3]: endbootargs
ELF file is 64 bit
Attempting to allocate memory for ELF segment: addr: 0xffffffff81100000 (adjusted to: 0x0000000001100000), size 0x9fc080
Allocated memory for ELF segment: addr: 0xffffffff81100000, size 0x9fc080
Processing PHDR 0
  Loading 984c80 bytes at ffffffff81100000
  Clearing 77400 bytes at ffffffff81a84c80
## Loading Linux kernel with entry point: 0xffffffff81105f90 ...
Bootloader: Done loading app on coremask: 0xfff
Linux version 2.6.32.13-Cavium-Octeon (hello@xyz.zzz) (gcc version 4.3.3 (Cavium Networks Development Version) ) #251 SMP Tue Jun 15 17:27:41 PDT 2010
CVMSEG size: 2 cache lines (256 bytes)
bootconsole [early0] enabled
CPU revision is: 000d0409 (Cavium Octeon+)
Checking for the multiply/shift bug... no.
Checking for the daddiu bug... no.
Determined physical RAM map:
 memory: 00000000002e0000 @ 00000000081d0000 (usable after init)
 memory: 000000000004a000 @ 0000000001a46000 (usable)
 memory: 0000000006400000 @ 0000000001b00000 (usable)
 memory: 0000000007800000 @ 0000000008500000 (usable)
 memory: 0000000032400000 @ 0000000020000000 (usable)
Wasting 376656 bytes for tracking 6726 unused pages
Initial ramdisk at: 0xa8000000081d0000 (3014656 bytes)
Zone PFN ranges:
  Normal   0x00001a46 -> 0x00052400
Movable zone start PFN for each node
early_node_map[5] active PFN ranges
.
.
. 
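
The size and placement arithmetic used in this section can be reproduced as follows. This is a minimal sketch: block_size mirrors the gawk expression shown earlier, while first_fit is only an assumed way of picking the lowest 64KB-aligned address inside the blocks printed by freeprint. It does not model how the bootloader itself allocates memory, and both function names are illustrative.

```python
ALIGN = 0x10000  # 64KB, the alignment used in this section


def block_size(file_size, align=ALIGN):
    """Mirror the gawk expression: (size + align) - (size % align).

    A size that is already an exact multiple of align still grows by
    one full align unit, exactly as the gawk one-liner behaves.
    """
    return (file_size + align) - (file_size % align)


def first_fit(free_blocks, size, align=ALIGN, limit=1 << 32):
    """Return the lowest aligned address where 'size' bytes fit below 'limit'.

    free_blocks is a list of (address, size) pairs as printed by freeprint.
    Returns None if no block can hold the request.
    """
    for base, length in free_blocks:
        start = (base + align - 1) // align * align  # round base up
        end = start + size
        if end <= base + length and end <= limit:
            return start
    return None


# Values taken from the transcript above.
free_blocks = [
    (0x081C0940, 0x07E2D2C0),
    (0x26000000, 0xDA000000),
    (0x410000000, 0x0FF00000),
]
size = block_size(3000755)  # rootfs.cpio.gz transferred 3000755 bytes
addr = first_fit(free_blocks, size)
print(hex(size), hex(addr))  # prints: 0x2e0000 0x81d0000
```

The two printed values match the size and address passed to namedalloc in the transcript, and the size matches the 3014656 bytes the kernel reports for the initial ramdisk.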

20. Octeon Performance Counters

The Octeon performance counters are available through /proc/octeon_perf. Through this interface you can control the two performance counters per core and the four performance counters available in the L2 controller. Here is some sample output from /proc/octeon_perf:

    octeon:~# cat /proc/octeon_perf
                     sissue           dissue
    CPU 0:        211717427        180982164
    CPU 1:        211487633        180958893
    CPU 2:        211465061        180883863
    CPU 3:        211412737        180907337
    CPU 4:        211456437        180956208
    CPU 5:        211416750        180913876
    CPU 6:        211500423        180923920
    CPU 7:        211452287        180942579
    CPU 8:        211397832        180903745
    CPU 9:        211441735        180934627
    CPU10:        211487240        180979999
    CPU11:        211439250        180933492
    CPU12:        211441315        180941196
    CPU13:        211435641        180931735
    CPU14:        211469154        180958894

    imiss: 3634
    ihit: 98044
    dmiss: 7570
    dhit: 786474

    Configuration of the performance counters is controller by writing
    one of the following values to:
        /sys/module/perf_counters/parameters/counter{0,1}
        /sys/module/perf_counters/parameters/l2counter{0-3}

    Possible CPU counters:
        none clk issue ret nissue sissue dissue ifi
        br brmis j jmis replay iuna trap
        uuload uustore uload ustore ec mc cc csrc
        cfetch cpref ica ii ip cimiss
        wbuf wdat wbufld wbuffl wbuftr badd baddl2 bfill
        ddids idids didna lds lmlds iolds dmlds
        sts lmsts iosts iobdma dtlb dtlbad itlb
        sync synciob syncw

    Possible L2 counters:
        cycles imiss ihit dmiss
        dhit miss hit victim-buffer-hit
        lfb-nq-index-conflict tag-probe tag-update tag-probe-completed
        tag-dirty-victim data-store-nop data-store-read data-store-write
        memory-fill-data-valid memory-write-request memory-read-request memory-write-data-valid
        xmc-nop xmc-ldt xmc-ldi xmc-ldd
        xmc-stf xmc-stt xmc-stp xmc-stc
        xmc-dwb xmc-pl2 xmc-psl1 xmc-iobld
        xmc-iobst xmc-iobdma xmc-iobrsp xmd-bus-valid
        xmd-bus-valid-dst-l2c xmd-bus-valid-dst-iob xmd-bus-valid-dst-pp rsc-nop
        rsc-stdn rsc-fill rsc-refl rsc-stin
        rsc-scin rsc-scfl rsc-scdn rsd-data-valid
        rsd-data-valid-fill rsd-data-valid-strsp rsd-data-valid-refl lrf-req
        dt-rd-alloc dt-wr-inva
    Warning: Counter configuration doesn't update till you access /proc/octeon_perf.
    

NOTE: The L2 performance counter events are different in OCTEON II chips. See below for a list of supported events.

Configuration of the performance counters is controlled by writing the desired counter name to the /sys parameter for each counter. The possible counter types are:

Counter Control Files

  • /sys/module/perf_counters/parameters/counter{0,1}
  • /sys/module/perf_counters/parameters/l2counter{0-3}

    Core Counters
    - none - Turn off the performance counter
    - clk - Conditionally clocked cycles (as opposed to count/cvm_count which count even with no clocks)
    - issue - Instructions issued but not retired
    - ret - Instructions retired
    - nissue - Cycles no issue
    - sissue - Cycles single issue
    - dissue - Cycles dual issue
    - ifi - Cycle ifetch issued (but not necessarily commit to pp_mem)
    - br - Branches retired
    - brmis - Branch mispredicts
    - j - Jumps retired
    - jmis - Jumps mispredicted
    - replay - Mem Replays
    - iuna - Cycles idle due to unaligned_replays
    - trap - trap_6a signal
    - uuload - Unexpected unaligned loads (REPUN=1)
    - uustore - Unexpected unaligned stores (REPUN=1)
    - uload - Unaligned loads (REPUN=1 or USEUN=1)
    - ustore - Unaligned stores (REPUN=1 or USEUN=1)
    - ec - Exec clocks (must set CvmCtl[DISCE] for accurate timing)
    - mc - Mul clocks (must set CvmCtl[DISCE] for accurate timing)
    - cc - Crypto clocks (must set CvmCtl[DISCE] for accurate timing)
    - csrc - Issue_csr clocks (must set CvmCtl[DISCE] for accurate timing)
    - cfetch - Icache committed fetches (demand+prefetch)
    - cpref - Icache committed prefetches
    - ica - Icache aliases
    - ii - Icache invalidates
    - ip - Icache parity error
    - cimiss - Cycles idle due to imiss (must set CvmCtl[DISCE] for accurate timing)
    - wbuf - Number of write buffer entries created
    - wdat - Number of write buffer data cycles used (may need to set CvmCtl[DISCE] for accurate counts)
    - wbufld - Number of write buffer entries forced out by loads
    - wbuffl - Number of cycles that there was no available write buffer entry (may need to set CvmCtl[DISCE] and CvmMemCtl[MCLK] for accurate counts)
    - wbuftr - Number of stores that found no available write buffer entries
    - badd - Number of address bus cycles used (may need to set CvmCtl[DISCE] for accurate counts)
    - baddl2 - Number of address bus cycles not reflected (i.e. destined for L2) (may need to set CvmCtl[DISCE] for accurate counts)
    - bfill - Number of fill bus cycles used (may need to set CvmCtl[DISCE] for accurate counts)
    - ddids - Number of Dstream DIDs created
    - idids - Number of Istream DIDs created
    - didna - Number of cycles that no DIDs were available (may need to set CvmCtl[DISCE] and CvmMemCtl[MCLK] for accurate counts)
    - lds - Number of load issues
    - lmlds - Number of local memory load issues
    - iolds - Number of I/O load issues
    - dmlds - Number of loads that were not prefetches and missed in the cache
    - sts - Number of store issues
    - lmsts - Number of local memory store issues
    - iosts - Number of I/O store issues
    - iobdma - Number of IOBDMAs
    - dtlb - Number of dstream TLB refill, invalid, or modified exceptions
    - dtlbad - Number of dstream TLB address errors
    - itlb - Number of istream TLB refill, invalid, or address error exceptions
    - sync - Number of SYNC stall cycles (may need to set CvmCtl[DISCE] for accurate counts)
    - synciob - Number of SYNCIOBDMA stall cycles (may need to set CvmCtl[DISCE] for accurate counts)
    - syncw - Number of SYNCWs

    Added in OCTEON II

    - eretmis - D/eret mispredicts
    - likmis - Branch likely mispredicts
    - hazard-trap - Hazard traps due to *MTC0 to CvmCtl, Perf counter control, EntryHi, or CvmMemCtl registers

    L2 Counters
    - cycles - Cycles
    - imiss - L2 Instruction Miss
    - ihit - L2 Instruction Hit
    - dmiss - L2 Data Miss
    - dhit - L2 Data Hit
    - miss - L2 Miss (I/D)
    - hit - L2 Hit (I/D)
    - victim-buffer-hit - L2 Victim Buffer Hit (Retry Probe)
    - lfb-nq-index-conflict - LFB-NQ Index Conflict
    - tag-probe - L2 Tag Probe (issued - could be VB-Retried)
    - tag-update - L2 Tag Update (completed). Note: Some CMD types do not update
    - tag-probe-completed - L2 Tag Probe Completed (beyond VB-RTY window)
    - tag-dirty-victim - L2 Tag Dirty Victim
    - data-store-nop - L2 Data Store NOP
    - data-store-read - L2 Data Store READ
    - data-store-write - L2 Data Store WRITE
    - memory-fill-data-valid - Memory Fill Data valid
    - memory-write-request - Memory Write Request
    - memory-read-request - Memory Read Request
    - memory-write-data-valid - Memory Write Data valid
    - xmc-nop - XMC NOP
    - xmc-ldt - XMC LDT
    - xmc-ldi - XMC LDI
    - xmc-ldd - XMC LDD
    - xmc-stf - XMC STF
    - xmc-stt - XMC STT
    - xmc-stp - XMC STP
    - xmc-stc - XMC STC
    - xmc-dwb - XMC DWB
    - xmc-pl2 - XMC PL2
    - xmc-psl1 - XMC PSL1
    - xmc-iobld - XMC IOBLD
    - xmc-iobst - XMC IOBST
    - xmc-iobdma - XMC IOBDMA
    - xmc-iobrsp - XMC IOBRSP
    - xmd-bus-valid - XMD Bus valid (all)
    - xmd-bus-valid-dst-l2c - XMD Bus valid (DST=L2C) Memory
    - xmd-bus-valid-dst-iob - XMD Bus valid (DST=IOB) REFL Data
    - xmd-bus-valid-dst-pp - XMD Bus valid (DST=PP) IOBRSP Data
    - rsc-nop - RSC NOP
    - rsc-stdn - RSC STDN
    - rsc-fill - RSC FILL
    - rsc-refl - RSC REFL
    - rsc-stin - RSC STIN
    - rsc-scin - RSC SCIN
    - rsc-scfl - RSC SCFL
    - rsc-scdn - RSC SCDN
    - rsd-data-valid - RSD Data Valid
    - rsd-data-valid-fill - RSD Data Valid (FILL)
    - rsd-data-valid-strsp - RSD Data Valid (STRSP)
    - rsd-data-valid-refl - RSD Data Valid (REFL)
    - lrf-req - LRF-REQ (LFB-NQ)
    - dt-rd-alloc - DT RD-ALLOC
    - dt-wr-inva - DT WR-INVA

    L2 Counters in OCTEON II chips

    - none - None
    - hit - L2 Tag Hit
    - miss - L2 Tag Miss
    - no-alloc - L2 Tag NoAlloc (forced no-allocate)
    - victim - L2 Tag Victim
    - sc-fail - SC Fail
    - sc-pass - SC Pass
    - lfb-valid - LFB Occupancy (each cycle adds # of LFBs valid)
    - lfb-wait-lfb - LFB Wait LFB (each cycle adds # of LFBs waiting for other LFBs)
    - lfb-wait-vab - LFB Wait VAB (each cycle adds # of LFBs waiting for VAB)
    - quad0-index - Quad 0 index bus inuse
    - quad0-read - Quad 0 read data bus inuse
    - quad0-bank - Quad 0 # banks inuse (0-4/cycle)
    - quad0-wdat - Quad 0 wdat flops inuse (0-4/cycle)
    - quad1-index - Quad 1 index bus inuse
    - quad1-read - Quad 1 read data bus inuse
    - quad1-bank - Quad 1 # banks inuse (0-4/cycle)
    - quad1-wdat - Quad 1 wdat flops inuse (0-4/cycle)
    - quad2-index - Quad 2 index bus inuse
    - quad2-read - Quad 2 read data bus inuse
    - quad2-bank - Quad 2 # banks inuse (0-4/cycle)
    - quad2-wdat - Quad 2 wdat flops inuse (0-4/cycle)
    - quad3-index - Quad 3 index bus inuse
    - quad3-read - Quad 3 read data bus inuse
    - quad3-bank - Quad 3 # banks inuse (0-4/cycle)
    - quad3-wdat - Quad 3 wdat flops inuse (0-4/cycle)

    

After you change which counter is displayed, you need to cat /proc/octeon_perf twice to make sure the new value is used. The first read shows the results from the old counter and resets the new counter to zero. The second read displays the new counter.

Here is an example that sets up the L2 controller to count cache hits and misses:

    octeon:~# echo imiss > /sys/module/perf_counters/parameters/l2counter0
    octeon:~# echo ihit > /sys/module/perf_counters/parameters/l2counter1
    octeon:~# echo dmiss > /sys/module/perf_counters/parameters/l2counter2
    octeon:~# echo dhit > /sys/module/perf_counters/parameters/l2counter3
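
With the four L2 counters configured as in the commands above, hit rates can be derived from the imiss, ihit, dmiss, and dhit lines of /proc/octeon_perf. The sketch below is illustrative only: the function names are hypothetical, and the sample text is copied from the output shown earlier in this section. On a target the text would come from reading /proc/octeon_perf, after the two reads needed for new counter settings to take effect.

```python
import re


def parse_l2_counters(text):
    """Collect 'name: value' lines (e.g. 'imiss: 3634') into a dict.

    Per-core lines such as 'CPU 0: 211717427 180982164' carry two values
    and therefore do not match the single-value pattern.
    """
    counters = {}
    for line in text.splitlines():
        match = re.fullmatch(r"\s*([a-z0-9-]+):\s+(\d+)\s*", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def hit_rate(hits, misses):
    """Fraction of accesses that hit; 0.0 when nothing was counted."""
    total = hits + misses
    return hits / total if total else 0.0


# Sample values copied from the output shown earlier in this section.
sample = """
    imiss: 3634
    ihit: 98044
    dmiss: 7570
    dhit: 786474
"""
c = parse_l2_counters(sample)
print(f"L2 instruction hit rate: {hit_rate(c['ihit'], c['imiss']):.3f}")
print(f"L2 data hit rate:        {hit_rate(c['dhit'], c['dmiss']):.3f}")
```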
    

21. Debugging the Kernel

The Octeon Multicore debugger mipsisa64-octeon-elf-gdb may be used to debug the Linux kernel. The standalone simple executive debugging extensions provide improved debugging over standard KGDB.

Three GDB state variables control support for multicore debugging: active-cores, focus, and step-all. The standard GDB set and show commands access these variables. Each variable's effect on the debugging environment is described below:

  • active-cores: This is a comma separated list of cores under active control of the debugger. Cores in this list will stop if any other core hits a breakpoint. Cores not in this list will only stop if they hit a breakpoint. Note that although cores not in active-cores do not stop when other cores hit a breakpoint, they do suffer a performance hit. As a convenience, setting the active cores to an empty string with "set active-cores" is interpreted as setting all cores active.

    (Core#0-gdb) set active-cores 0,1
    (Core#0-gdb) show active-cores
    The cores stopped on execution of a breakpoint by another core is "0,1".
    (Core#0-gdb) set active-cores
    (Core#0-gdb) show active-cores
    The cores stopped on execution of a breakpoint by another core is
    "0,1,2,3,4,6,7,8,9,10,11,12,13,14,15".
    (Core#0-gdb)

  • focus: This is the index (0-15) of the core directly interacting with the debugger. Data operations (r/w memory, registers) are performed in the focus core's context. Only cores currently stopped in a debug exception may become the focus core. This means making a non-active core the focus generally requires a breakpoint or changing the active-cores to include it. The GDB prompt will show the current focus core. Note that the focus core may change if another core hits a breakpoint. The core to hit a breakpoint first automatically becomes the focus core.

    (Core#0-gdb) show focus
    The focus core of debugger operations is 0.
    (Core#0-gdb) set focus 5
    (Core#5-gdb) show focus
    The focus core of debugger operations is 5.
    (Core#5-gdb)

  • step-all: This GDB variable controls which cores act when the commands step, step instruction, and continue are executed. By default (step-all off) only the focus core performs the operation. Other cores execute only if step-all is on. Note that stopped cores that are not in the active-cores list will continue execution under any step command when step-all is enabled.

    (Core#5-gdb) show step-all
    Step commands affect all cores is off.
    (Core#5-gdb) set step-all on
    (Core#5-gdb) show step-all
    Step commands affect all cores is on.
    (Core#5-gdb)

In order to debug the Linux kernel, the following steps are required:

-1 For the EBT58XX, you will need two serial port connections to the board. Connect the two serial ports by the LED display to the host PC using null modem cables. The first should already be hooked up for use with Minicom as a console. The second will be used for the debugger traffic.

-2 Build the kernel with debugging support.

        $ cd linux
        $ make -s [sim|kernel]-config
        $ cd kernel_2.6/linux
        $ make menuconfig
        Machine selection  --->
            < > Octeon watchdog driver
        Kernel hacking  --->
            [ ] Remote GDB kernel debugging
            [*] Remote GDB debugging using the Cavium Networks Multicore GDB
        $ cd ../..
        $ make -s [sim|kernel]
        

-3 Start the Linux kernel.

Octeon Simulator

        $ cd kernel_2.6/linux
        $ ./oct-linux -quiet -noperf -uart1=2021
        terminal 2 $ telnet localhost 2020
        

EBT58XX Hardware Reference Board

Copy the Linux kernel to a compact flash.

        $ mkdir -p /mnt/usb
        $ fdisk -l /dev/sda         # Only needed on some Kernel 2.6 systems
        $ mount /dev/sda1 /mnt/usb
        $ mips64-octeon-linux-gnu-strip -o /mnt/usb/vmlinux.64 kernel_2.6/linux/vmlinux.64
        $ umount /mnt/usb
        

Put the compact flash into the EBT58XX and reset the board. At the bootloader prompt load linux.

        Octeon ebt5800# fatload ide 0 $(loadaddr) vmlinux.64
        Octeon ebt5800# bootoctlinux $(fileaddr)
        

-4 Start up and connect the debugger

Octeon Simulator

        $ cd linux/kernel_2.6/linux
        $ mipsisa64-octeon-elf-ddd --debugger mipsisa64-octeon-elf-gdb vmlinux.64
        gdb> target octeon tcp::2021
        gdb> load
        gdb> stepi
        

EBT58XX Hardware Reference Board

        $ cd linux/kernel_2.6/linux
        $ mipsisa64-octeon-elf-ddd --debugger mipsisa64-octeon-elf-gdb vmlinux.64
        gdb> target octeon /dev/ttyS1
        gdb> load
        gdb> stepi
        

or with GDB:

	/usr/local/Cavium_Networks/OCTEON-SDK$ mipsisa64-octeon-elf-gdb -q linux/kernel_2.6/linux/vmlinux.64
	(Core#0-gdb) target octeon /dev/ttyS0
	Remote target octeon connected to /dev/ttyS0
	(Core#0-gdb) b r4k_wait
	Breakpoint 1 at 0xffffffff8110a434: file arch/mips/kernel/cpu-probe.c, line 51.
	(Core#0-gdb) c
	Continuing.
	Breakpoint 1, r4k_wait () at arch/mips/kernel/cpu-probe.c:51
	51        __asm__(".set\tmips3\n\t"
	(Core#0-gdb)
	

Control-C interrupts are handled in the kernel inside do_IRQ in arch/mips/kernel/irq.c. Whenever interrupt 2 occurs, the debug uart is polled. If any data is available, the MCD0 signal is pulsed, which causes all cores to enter the interrupt handler. A side effect is that any Control-C interrupt or initial debug entry will always stop in do_IRQ under exception context.

Commonly the kernel uses smp_call_function to synchronize operations across all cores. This will hang forever if a core is currently stopped in the debug exception. Make sure step-all is on whenever SMP operations are expected.

22. Debugging the Kernel with KGDB

The standard Linux kernel debugger, KGDB, is also available on Octeon. The setup for KGDB is very similar to the Cavium Networks Multicore GDB. Here are the steps:

-1 For the EBT58XX, you will need two serial port connections to the board. Connect the two serial ports by the LED display to the host PC using null modem cables. The first should already be hooked up for use with Minicom as a console. The second will be used for the debugger traffic. Optionally, a single uart can be used if "Console output to GDB" is enabled. This is not recommended since it breaks output for userspace applications.

-2 Build the kernel with debugging support and without the octeon watchdog driver.

        $ cd linux
        $ make -s [sim|kernel]-config
        $ cd kernel_2.6/linux
        $ make menuconfig
        Machine selection  --->
            < > Octeon watchdog driver
        Kernel hacking  --->
            [*] KGDB: kernel debugging with remote gdb  --->
                  <*>   8250/16550 and compatible serial support

        $ cd ../..
        $ make -s [sim|kernel]
        

-3 Start the Linux kernel.

Octeon Simulator

        $ cd linux/kernel_2.6
        $ ./oct-linux -quiet -noperf -uart1=2021
        terminal 2 $ telnet localhost 2020
        

EBT58XX Hardware Reference Board

Copy the Linux kernel to a compact flash.

        $ mkdir -p /mnt/usb
        $ fdisk -l /dev/sda         # Only needed on some Kernel 2.6 systems
        $ mount /dev/sda1 /mnt/usb
        $ mips64-octeon-linux-gnu-strip -o /mnt/usb/vmlinux.64 kernel_2.6/linux/vmlinux.64
        $ umount /mnt/usb
        

Put the compact flash into the EBT58XX and reset the board. At the bootloader prompt load linux.

        Octeon ebt5800# fatload ide 0 $(loadaddr) vmlinux.64
        Octeon ebt5800# bootoctlinux $(fileaddr) kgdbwait kgdboc=ttyS1,38400
        

-4 Start up and connect the debugger

Octeon Simulator

        $ cd linux/kernel_2.6/linux
        $ mips64-octeon-linux-gnu-gdb vmlinux.64
        (gdb) target remote tcp::2021
        (gdb) stepi
        

EBT58XX Hardware Reference Board

        $ cd linux/kernel_2.6/linux
        $ mips64-octeon-linux-gnu-gdb vmlinux.64
        (gdb) set remotebaud 38400
        (gdb) target remote /dev/ttyS1
        (gdb) stepi
        

23. Booting Two Separate Kernels on an EBT58XX

Octeon Linux can be configured to run two separate kernels with each kernel using a separate uart for the console. Below is the procedure to boot one kernel on all the even cores and another on all the odd cores.

-1 Build a Linux kernel.

        $ cd ${OCTEON_ROOT}/linux
        $ make -s kernel strip
        

-2 Transfer this kernel to the compact flash.

        $ mount /mnt/cf1
        $ cp kernel_2.6/linux/vmlinux.64 /mnt/cf1/vmlinux.64_1
        $ umount /mnt/cf1
        

-3 Configure the kernel to build the second image.

        $ make menuconfig
        Machine selection  --->
            [*]   Build the kernel to be used as a 2nd kernel on the same chip
       

-4 Rebuild the kernel.

        $ make -s kernel strip
        

-5 Transfer this kernel to the compact flash.

        $ mount /mnt/cf1
        $ cp kernel_2.6/linux/vmlinux.64 /mnt/cf1/vmlinux.64_2
        $ umount /mnt/cf1
        

-6 Put the compact flash into the EBT58XX and boot the kernels.

        Octeon ebt5800# fatload ide 0 $(loadaddr) vmlinux.64_1
        Octeon ebt5800# bootoctl $(fileaddr) coremask=aaaa
        Octeon ebt5800# fatload ide 0 $(loadaddr) vmlinux.64_2
        Octeon ebt5800# bootoctl $(fileaddr) coremask=5555
        

-7 You should get kernel boot messages out of both uarts.
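
The coremask values used in step 6 can be derived from core numbers. The sketch below assumes the usual convention that bit N of the mask selects core N, which is consistent with the bootloader reporting coremask 0xfff when 12 cores are booted. The function name coremask is illustrative.

```python
def coremask(cores):
    """Return the hex coremask string with one bit set per core number."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return format(mask, "x")


odd_cores = range(1, 16, 2)    # cores 1, 3, ..., 15
even_cores = range(0, 16, 2)   # cores 0, 2, ..., 14
print(coremask(odd_cores), coremask(even_cores))  # prints: aaaa 5555
```

Under that assumption, coremask=aaaa places the first kernel on the odd-numbered cores and coremask=5555 places the second kernel on the even-numbered cores.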

24. Ethernet between two booted kernels using the POW

When running multiple kernels on Octeon, ethernet traffic between the kernels gets more complicated than with a normal ethernet driver. Only one kernel is allowed to run the Cavium ethernet driver. The other kernels communicate ethernet traffic using the octeon-pow-ethernet device driver. This driver uses the Octeon POW to route network traffic between different kernels using a group per kernel. The octeon-pow-ethernet module takes two parameters:

  • receive_group: 0-15 POW group to receive packets from. This must be unique in the system. If you don't specify a value, the hardware core ID of the CPU that loads the module is used. It is strongly advised that you always use "schedtool -a 0 -e modprobe octeon-pow-ethernet ..." to load the module. This way the default value of this parameter is deterministic.

  • broadcast_groups: Bitmask of groups to send broadcasts to. This must be specified. Be careful not to send broadcasts to groups that aren't read; otherwise you may fill the POW and stop receiving packets. The value of this parameter is exceedingly important. It should have a bit set for the first core of each kernel that is running, where "first core" means the core that loads the octeon-pow-ethernet driver. If you followed the suggestion under receive_group, this is virtual core zero under each kernel, as selected by schedtool. An error in this parameter may leave some kernels unable to communicate, or lose packet buffers to unread POW groups.
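
Since broadcast_groups is a bitmask with one bit per receive group, it can be computed from the list of groups that are actually being read. This is a minimal sketch with an illustrative function name; the example assumes two kernels whose receive groups are 0 and 1.

```python
def broadcast_groups(receive_groups):
    """Return the bitmask with one bit set per POW receive group (0-15)."""
    mask = 0
    for group in receive_groups:
        if not 0 <= group <= 15:
            raise ValueError(f"POW group out of range: {group}")
        mask |= 1 << group
    return mask


# Two kernels receiving on groups 0 and 1.
print(broadcast_groups([0, 1]))  # prints: 3
```

Listing only groups that some kernel reads is what keeps broadcasts from piling up in unread POW groups.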

In order to use this driver you must:

  1. Enable the driver under menuconfig
           make menuconfig
               Device Drivers --->
                   Networking Device Support --->
                       Ethernet (1000 Mbit) --->
    			<M> POW based internal only ethernet driver
           
  2. Under kernel one, modprobe the octeon-pow-ethernet device driver. It is very important that the broadcast_groups be set correctly. If the parameter isn't set properly, you will leak packet buffers into unused POW groups, causing a lockup of networking.
     		$ schedtool -a 0 -e modprobe octeon-pow-ethernet broadcast_groups=3
           
  3. Under kernel two, modprobe the octeon-pow-ethernet device driver.
     		$ schedtool -a 0 -e modprobe octeon-pow-ethernet broadcast_groups=3
           
  4. If desired, set up kernel one to bridge traffic between the main network and the POW network.
     		$ brctl addbr br0
    		$ brctl addif br0 eth0
    		$ brctl addif br0 oct0
    		$ ifconfig eth0 up promisc
    		$ ifconfig oct0 up promisc
    		$ ifconfig br0 up
    		Wait for the bridge to begin forwarding.
           
  5. Use either ifconfig or DHCP to assign an IP address to kernel one.
    	    $ ifconfig br0 10.0.0.1
    		or
               $ udhcpc -i br0
           
  6. Use either ifconfig or DHCP to assign an IP address to kernel two.
    	    $ ifconfig oct0 10.0.0.2
    		or
               $ udhcpc -i oct0
           

25. Octeon Watchdog Driver

Octeon Linux includes a watchdog driver for monitoring all cores running Linux. The watchdog driver is controlled by the kernel config option CONFIG_CAVIUM_OCTEON_WATCHDOG. The driver consists of the two files octeon-wdt.c and octeon-wdt-nmi.S located under the "linux/arch/mips/cavium-octeon" directory. The driver supports two parameters:

  • heartbeat=s: the watchdog timeout in seconds.

  • nowayout=x: keep the watchdog active even if userspace monitoring app dies.

Here is an overview of the processing of watchdogs.

  1. Every core that is "online" under Linux is configured to receive the interrupt for the watchdog with its same number.
  2. Each of the watchdogs is configured to generate an interrupt, followed by an NMI, followed finally by a chip soft reset. Each progression occurs every "timeout".
  3. The interrupt handler pokes the watchdog for the associated core, resetting it to the first state.
  4. If for some reason the interrupt handler doesn't poke the watchdog, an NMI is sent to the core after another timeout.
  5. The NMI handler prints a message to UART0 and then sits in a spin loop until chip reset.
  6. Since the NMI doesn't stop the watchdog, it will perform a chip wide soft reset after another timeout.
  7. If a userspace application opens /dev/watchdog, the driver quits poking the watchdog from the interrupt handler. The poking is instead done when a userspace application writes to the device.
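
The progression described above can be sketched as a small shell function that maps the time since the last poke to the watchdog stage. This is only an illustration of the interrupt, NMI, soft reset sequence; the function name wdt_stage is hypothetical, and the 5368 ms timeout is taken from the sample boot log in this document.

```shell
# wdt_stage ELAPSED_MS TIMEOUT_MS
# Print the stage a watchdog has reached ELAPSED_MS after its last poke.
wdt_stage() {
    elapsed=$1
    timeout=$2
    # Each full timeout advances the watchdog by one stage.
    stage=$(( elapsed / timeout ))
    case $stage in
        0) echo "counting" ;;     # no expiration yet
        1) echo "interrupt" ;;    # first expiration: interrupt handler pokes
        2) echo "nmi" ;;          # second expiration: NMI sent to the core
        *) echo "soft reset" ;;   # third expiration: chip-wide soft reset
    esac
}

wdt_stage 1000 5368    # counting
wdt_stage 6000 5368    # interrupt
wdt_stage 12000 5368   # nmi
wdt_stage 17000 5368   # soft reset
```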

When programming the watchdogs, keep these notes in mind:

  1. A soft reset does not stop the watchdog counters. It is possible to continue to get NMI and soft resets after a chip reboot. For this reason the Linux kernel disables the watchdogs before rebooting the system with a soft reset.
  2. Once a watchdog has expired it must be poked before it can operate again. It is not sufficient to disable/enable the watchdog. It is advised that you always poke the watchdog once during watchdog initialization.
  3. Bit 44 of CIU_INTX_SUM0 is the logical OR of all watchdog signals enabled in CIU_INTX_EN1. Since CIU_INTX_EN0 bit 44 isn't implemented, you must mask watchdog interrupts using CIU_INTX_EN1 instead of the usual CIU_INTX_EN0.
  4. The bootloader reset vector and the NMI interrupt handler are at the same location in flash. In order to install an NMI handler you must use one of the bootbus moveable regions to shadow the reset vector. Also keep in mind that the NMI vector is only 128 bytes.
  5. It is not possible to program the watchdogs to cause a soft reset without an NMI. This means you almost always require an NMI handler.
  6. In many cases, interrupts can continue to be processed even though the system appears "dead". For example, after a halt of the kernel using the "poweroff" command, the kernel continues to process interrupts. This means the watchdogs may continue to be poked even though the system is unusable. However, if a userspace application has opened /dev/watchdog, the watchdogs are no longer poked in the interrupt handlers, and the system will reset.
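
Since the driver pokes the watchdog whenever a userspace application writes to /dev/watchdog, a monitoring loop can be sketched in a few lines of shell. This is a minimal sketch, not the SDK's monitoring tool: the function name pet_watchdog and its COUNT and INTERVAL arguments are hypothetical, and what happens when the device is closed again depends on the driver and on the nowayout parameter, which this sketch does not cover.

```shell
# pet_watchdog DEVICE COUNT INTERVAL
# Open DEVICE once, then write one byte every INTERVAL seconds, COUNT times.
# Each write pokes the watchdog; the device stays open for the whole loop.
pet_watchdog() {
    dev=$1
    count=$2
    interval=$3
    {
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '.' >&3
            sleep "$interval"
            i=$(( i + 1 ))
        done
    } 3>"$dev"
}

# Example (on the target, as root): poke once per second for ten seconds.
#   pet_watchdog /dev/watchdog 10 1
```

The interval should be kept well below the configured heartbeat so that a single delayed iteration does not let the watchdog expire.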

26. Octeon Specific Changes in Userspace

  1. Enable userspace access to XKPHYS addresses only for Octeon hardware IO addresses.
  2. GLIBC patches for abi=n32 and abi=n64.
  3. /proc/octeon_info contains Octeon board specific information.
  4. /proc/octeon_perf contains Octeon specific performance counters.

27. Octeon Specific Kernel Config Options

  • CONFIG_CAVIUM_OCTEON2: This option enables the generation of OCTEON II specific instructions by the compiler, resulting in a kernel that is more efficient, but that will not run on OCTEON and OCTEON Plus processor cores.

  • CONFIG_CAVIUM_OCTEON_CHK_CVMX_PARAMETER: Compile the kernel with CVMX parameter checking enabled. This might catch some programming errors, but will result in a slower kernel.

  • CONFIG_CAVIUM_OCTEON_CHK_CVMX_ADDRESS: Compile the kernel with CVMX CSR address checking enabled. This might catch some programming errors, but will result in a slower kernel.

  • CONFIG_CAVIUM_OCTEON_CHK_CVMX_POW: Compile the kernel with CVMX CSR POW checking enabled. This might catch some programming errors, but will result in a slower kernel.

  • CONFIG_CAVIUM_OCTEON_2ND_KERNEL: This option configures this kernel to be linked at a different address and use the 2nd uart for output. This allows a kernel built with this option to be run at the same time as one built without this option. Also see section 2 for additional information.

  • CONFIG_CAVIUM_OCTEON_HW_FIX_UNALIGNED: Configure the Octeon hardware to automatically fix unaligned loads and stores. Normally unaligned accesses are fixed using a kernel exception handler. This option enables the hardware automatic fixups, which requires only an extra 3 cycles. Disable this option if you are running code that relies on address exceptions on unaligned accesses.

  • CONFIG_FAST_ACCESS_TO_THREAD_POINTER: For Mips, normally the TLS thread pointer is accessed by the userspace program executing a "rdhwr" from register $29. This register doesn't exist, so the kernel emulates the instruction assigning the thread pointer to the value register. This option supplies an alternate, faster access to the thread pointer. A side effect of this option is that the highest 8 bytes of CVMSEG is used by the kernel to save and restore the thread pointer during the TLB fault handlers. This CVMSEG address isn't available to user applications.

  • CONFIG_REPLACE_EMULATED_ACCESS_TO_THREAD_POINTER: When this option is set, the kernel can dynamically replace slower references to the thread pointer with fast accesses. This involves replacing userspace instructions at runtime, so it may not work with all programs. It is advised to use a toolchain that creates code for FAST_ACCESS_TO_THREAD_POINTER instead of this option. If you have code compiled with a Cavium compiler prior to release 1.5, or are using a non Cavium compiler, this option may allow you to receive most of the benefit of direct access to the thread pointer. It may also cause programs to fail. Instruction replacement is disabled on boot. It can be controlled by writing a mode to /sys/module/traps/parameters/thread_pointer_mode. The supported modes are:
      0 - Use the normal kernel emulation without any changes.
      1 - Replace emulated instructions with direct accesses to the thread register.
      2 - Replace emulated instructions and log the replacement PC.
      3 - Replace emulated instructions with break instructions. This will cause programs to fail, but makes it easy to stop gdb on the instruction.

  • CONFIG_TEMPORARY_SCRATCHPAD_FOR_KERNEL: For Mips, performance-critical kernel routines (like the TLB miss handlers) can normally only use registers K0 and K1 ($26 and $27) from the main register file. This option allocates space in CVMSEG LM for the same function. This can make the kernel routines run faster. A side effect of this is that the kernel will trash the CVMSEG LM locations, which are placed as high as possible in CVMSEG LM space, but below the space allocated for the FAST_ACCESS_TO_THREAD_POINTER option. Like FAST_ACCESS_TO_THREAD_POINTER, these CVMSEG locations are not available to user applications.

  • CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE: CVMSEG LM is a segment that accesses portions of the dcache as a local memory; the larger CVMSEG is, the smaller the cache is. This selects the size of CVMSEG LM, which is in cache blocks. The legal range is from zero to 54 cache blocks (i.e. CVMSEG LM is between zero and 6912 bytes). Also see section 8 for additional information.

  • CONFIG_CAVIUM_OCTEON_LOCK_L2: Enable locking parts of the kernel into the L2 cache.

  • CONFIG_CAVIUM_OCTEON_LOCK_L2_TLB: Lock the low level TLB fast path into L2.

  • CONFIG_CAVIUM_OCTEON_LOCK_L2_EXCEPTION: Lock the low level exception handler into L2.

  • CONFIG_CAVIUM_OCTEON_LOCK_L2_LOW_LEVEL_INTERRUPT: Lock the low level interrupt handler into L2.

  • CONFIG_CAVIUM_OCTEON_LOCK_L2_INTERRUPT: Lock the 2nd level interrupt handler in L2.

  • CONFIG_CAVIUM_OCTEON_LOCK_L2_MEMCPY: Lock the kernel's implementation of memcpy() into L2.

  • CONFIG_CAVIUM_OCTEON_USER_IO_PER_PROCESS: Allows user applications to use XKPHYS addresses directly to access IO space. Access can be enabled or disabled dynamically with the sysmips syscall by a process with root privilege. Without root privilege you can only remove access.

  • CONFIG_CAVIUM_OCTEON_USER_MEM_PER_PROCESS: Allows user applications to use XKPHYS addresses directly to access memory. Access can be enabled or disabled dynamically with the sysmips syscall. Without root privilege you can only remove access.

  • CONFIG_CAVIUM_OCTEON_USER_IO: Allows all user applications to directly access the Octeon hardware IO addresses (0x1000000000000 - 0x1ffffffffffff). This allows high performance networking applications to run in user space with minimal performance penalties. This also means a user application can bring down the entire system. Only use this option on embedded devices where all user applications are strictly controlled.

  • CONFIG_CAVIUM_OCTEON_USER_MEM: Allows all user applications to use XKPHYS addresses directly to memory. This allows user space direct access to shared memory not in use by Linux. This memory is suitable for use with the Octeon hardware. Cavium simple executive applications also share this memory. Since this bypasses all of the Linux memory protection, only use this option on embedded devices where all user applications are strictly controlled.

  • CONFIG_CAVIUM_RESERVE32: Reserve a shared memory region for user processes to use for hardware memory buffers. This is required for 32bit applications to be able to send and receive packets directly. Applications access this memory by memory mapping /dev/mem for the addresses in /proc/octeon_info. When this option is configured the octeon-ethernet driver will allocate its buffers from this shared memory region.

  • CONFIG_CAVIUM_OCTEON_WATCHDOG: This option enables a watchdog driver for all cores running Linux. It installs an NMI handler and pokes the watchdog based on an interrupt. On first expiration of the watchdog, the interrupt handler pokes it. The second expiration causes an NMI that prints a message and spins until the chip is reset. The third expiration causes a global soft reset. Also see section 25 for additional information.

  • CONFIG_CAVIUM_OCTEON_TRA: This option enables a driver for the Octeon trace buffer. By default it enables interrupts on some illegal memory accesses. See octeon-tra.c for information on customizing this driver to find specific problems.

  • CONFIG_OCTEON_ETHERNET: This driver supports the builtin ethernet ports on Cavium Networks' products in the Octeon family. This driver supports the CN3XXX, CN5XXX and CN6XXX Octeon processors.

  • CONFIG_OCTEON_NUM_PACKET_BUFFERS: Number of packet buffers (and work queue entries) to allocate for the ethernet driver. Zero is treated as 1024.

  • CONFIG_OCTEON_POW_ONLY_ETHERNET: This option enables a very simple ethernet driver for internal core to core traffic. It relies on another driver, octeon-ethernet, to perform all hardware setup. This driver's purpose is to supply basic networking between different Linux images running on the same chip. A single core loads the octeon-ethernet module, all other cores load this driver. On load, the driver waits for some other core to perform hardware setup.

  • CONFIG_OCTEON_MGMT_ETHERNET: This option enables the ethernet driver for the management port on Cavium Networks' Octeon CN57XX, CN56XX, CN55XX, CN54XX, CN52XX, and CN6XXX chips.

  • CONFIG_USB_OCTEON_HCD: The Octeon DWC_OTG USB host controller. All CN3XXX and CN5XXX based chips with USB are supported.

  • CONFIG_USB_OCTEON_EHCI: Enable support for the Octeon II SOC's on-chip EHCI controller. It is needed for high-speed (480Mbit/sec) USB 2.0 device support. All CN6XXX based chips with USB are supported.

  • CONFIG_USB_OCTEON_OHCI: Enable support for the Octeon II SOC's on-chip OHCI controller. It is needed for low-speed and full-speed USB 1.1 device support. All CN6XXX based chips with USB are supported.

  • CONFIG_CAVIUM_OCTEON_RAPIDIO: Connect the SRIO interfaces available in the Octeon II series of processors to the kernel's RapidIO subsystem. The existence of the SRIO ports is automatically detected and configured as either a host or device. Bus enumeration will be performed on host interfaces as appropriate. After configuring this option, you will likely want to enable the RapidIO network adapter under the devices menu.
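
The CVMSEG options above interact: CONFIG_CAVIUM_OCTEON_CVMSEG_SIZE is given in cache blocks, and FAST_ACCESS_TO_THREAD_POINTER takes the highest 8 bytes of the segment. The following is a minimal sketch of that arithmetic, assuming 128-byte cache blocks as reported in the sample boot log ("CVMSEG size: 2 cache lines (256 bytes)"). The function names are hypothetical.

```shell
# ASSUMPTION: 128-byte cache blocks, as reported in the sample boot log.
CACHE_BLOCK_BYTES=128

# cvmseg_bytes BLOCKS: size of CVMSEG LM in bytes for a given config value.
cvmseg_bytes() {
    blocks=$1
    if [ "$blocks" -lt 0 ] || [ "$blocks" -gt 54 ]; then
        echo "CVMSEG size must be 0..54 cache blocks" >&2
        return 1
    fi
    echo $(( blocks * CACHE_BLOCK_BYTES ))
}

# thread_pointer_offset BLOCKS: byte offset of the highest 8 bytes of CVMSEG,
# which the kernel uses when FAST_ACCESS_TO_THREAD_POINTER is enabled.
thread_pointer_offset() {
    bytes=$(cvmseg_bytes "$1") || return 1
    if [ "$bytes" -lt 8 ]; then
        echo "CVMSEG too small to hold the thread pointer" >&2
        return 1
    fi
    echo $(( bytes - 8 ))
}

cvmseg_bytes 2            # prints 256
cvmseg_bytes 54           # prints 6912
thread_pointer_offset 2   # prints 248
```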

28. OCTEON II Specific Kernel Considerations

The default configuration of the Octeon Linux kernel will run on all Octeon chips. However, the OCTEON II processor (CN6XXX) has several new machine instructions available that allow for more efficient code that cannot be used in a Linux kernel intended for general purpose use on the entire Octeon family. If it is known that the kernel will be run only on Octeon II processors, several optimizations can be enabled, yielding an increase in performance.

Many of the most performance critical portions of the kernel automatically select the most efficient machine instructions for the actual processor being used. OCTEON II specific instructions are already used in these places even with the default configuration.

  • CONFIG_CAVIUM_OCTEON2: As mentioned above, this option enables the generation of OCTEON II specific instructions by the compiler, resulting in a kernel that is more efficient, but that will not run on OCTEON and OCTEON Plus processor cores. Note that the use of this option has no effect on the ability of the kernel to run Octeon and OcteonPlus programs.

  • CONFIG_USB_OCTEON_HCD: Because the Octeon II has EHCI/OHCI compatible USB hardware, there is no need to include the DWC_OTG USB driver. Disabling this driver will reduce the size of the kernel, thus saving memory.

29. Kernel tracing with ftrace

The Linux kernel has a built-in tracing system that allows many aspects of the kernel to be recorded and then played back for analysis. This is called the ftrace system.

The tracing infrastructure is disabled by default, because even when inactive, it adds a small overhead to many kernel operations. It can be enabled from menuconfig:

    Kernel hacking  --->
        [*] Debug Filesystem
        [*] Compile the kernel with debug info
        [*] Tracers  --->
            [*]   Kernel Function Tracer
            [*]     Kernel Function Graph Tracer
            [*]   Interrupts-off Latency Tracer
            [*]   Scheduling Latency Tracer
            [*]   Trace SLAB allocations
            [*]   Trace workqueues
            [*]   Support for tracing block io actions
            [*]   enable/disable ftrace tracepoints dynamically
    

Once the tracing-enabled kernel is running, the tracing can be controlled as described in the kernel's tracing documentation in the linux/kernel_2.6/linux/Documentation/trace directory. The ftrace.txt file is a good place to start.
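
As a starting point, the following is a minimal shell sketch of driving a tracer by hand through the debugfs control files. It assumes debugfs is mounted at /sys/kernel/debug and relies on the current_tracer, available_tracers and trace files described in ftrace.txt. The function name run_tracer and its arguments are hypothetical.

```shell
# run_tracer TRACING_DIR TRACER SECONDS
# Select TRACER, let it run for SECONDS, print the trace, then switch back
# to the "nop" tracer so tracing stops.
run_tracer() {
    dir=$1
    tracer=$2
    secs=$3
    # Refuse tracers that this kernel was not built with.
    if ! grep -qw "$tracer" "$dir/available_tracers"; then
        echo "tracer $tracer is not available" >&2
        return 1
    fi
    echo "$tracer" > "$dir/current_tracer"
    sleep "$secs"
    cat "$dir/trace"
    echo nop > "$dir/current_tracer"
}

# Example (on the target, as root):
#   mount -t debugfs nodev /sys/kernel/debug
#   run_tracer /sys/kernel/debug/tracing function 1
```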

The embedded_rootfs contains the optional program trace-cmd which can be used to facilitate tracing. It has the ability to transmit trace data over a network port or save it to a file for off-line analysis. Please see the trace-cmd manual pages for more information.

trace-cmd and its graphical companion tool kernelshark can also be used on a development workstation to analyze trace data obtained from the Octeon Linux target board.

30. Address Space Randomization

In order to make the system more secure, the addresses of program stacks, shared libraries and heap are randomized. This is normally desired, however, it can make debugging a program more difficult because each run of the program will have a different address space layout.

The randomization may be disabled at boot time by passing norandmaps on the kernel command line. It can also be controlled at runtime.

To disable randomization:

~ # echo 0 > /proc/sys/kernel/randomize_va_space
    

To enable randomization:

~ # echo 2 > /proc/sys/kernel/randomize_va_space
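
The current setting can be read back from the same file. The following is a minimal sketch that translates the value into words; the meanings of 0, 1 and 2 follow the kernel's sysctl documentation, and the function name describe_randomize is hypothetical.

```shell
# describe_randomize MODE: explain a randomize_va_space value.
describe_randomize() {
    case $1 in
        0) echo "disabled" ;;
        1) echo "stack, mmap and vdso randomized" ;;
        2) echo "stack, mmap, vdso and heap randomized" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Show the current setting when the sysctl file is readable.
if [ -r /proc/sys/kernel/randomize_va_space ]; then
    describe_randomize "$(cat /proc/sys/kernel/randomize_va_space)"
fi
```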
    

31. Sample Boot Log

Here is a sample boot of Linux with 16 cores on an EBT58XX. The kernel is loaded through TFTP from a host at 192.168.162.57. DHCP is used to get the network IP address.
U-Boot 1.1.1 (U-boot build #: 235) (SDK version: 1.9.0-312) (Build time: Apr 23 2009 - 20:07:18)

EBT5800 board revision major:2, minor:0, serial #: 2009-2.0-00438
OCTEON CN5860-NSP pass 2.3, Core clock: 800 MHz, DDR clock: 399 MHz (798 Mhz data rate)
DRAM:  2048 MB
Clearing DRAM........ done
Flash:  8 MB
BIST check passed.
Net:   octeth0, octeth1, octeth2, octeth3
 Bus 0 (CF Card): not available

Octeon ebt5800# dhcp
Interface 1 has 4 ports (RGMII)
Interface 2 has 4 ports (NPI)
BOOTP broadcast 1
octeth0: Up 100 Mbps Full duplex (port 16)
DHCP client bound to address 192.168.162.98
Octeon ebt5800# setenv serverip 192.168.162.57
Octeon ebt5800# tftpboot $(loadaddr) vmlinux.64
Using octeth0 device
TFTP from server 192.168.162.57; our IP address is 192.168.162.98
Filename 'vmlinux.64'.
Load address: 0x20000000
Loading: #################################################################
	 #########################################################
done
Bytes transferred = 17434200 (10a0658 hex), 5934 Kbytes/sec
Octeon ebt5800# bootoctlinux $(loadaddr) numcores=$(numcores) endbootargs mtdparts=phys_mapped_flash:512k(bootloader)ro,2560k(kernel),4096k(cramfs),960k(jffs2),64k(bootloader_env)ro mem=0
argv[2]: numcores=16
argv[3]: endbootargs
ELF file is 64 bit
Attempting to allocate memory for ELF segment: addr: 0xffffffff81100000 (adjusted to: 0x0000000001100000), size 0x17b0f00
Allocated memory for ELF segment: addr: 0xffffffff81100000, size 0x17b0f00
Processing PHDR 0
  Loading 175bc80 bytes at ffffffff81100000
  Clearing 55280 bytes at ffffffff8285bc80
## Loading Linux kernel with entry point: 0xffffffff81105f90 ...
Bootloader: Done loading app on coremask: 0xffff
Linux version 2.6.32.10-Cavium-Octeon (testing@sw-build.caveonetworks.com) (gcc version 4.3.3 (Cavium Networks Development Version) ) #2 SMP Tue Apr 20 22:45:41 PDT 2010
CVMSEG size: 2 cache lines (256 bytes)
bootconsole [early0] enabled
CPU revision is: 000d030b (Cavium Octeon+)
Checking for the multiply/shift bug... no.
Checking for the daddiu bug... no.
Determined physical RAM map:
 memory: 00000000011f2000 @ 000000000166e000 (usable)
 memory: 000000000d400000 @ 0000000002900000 (usable)
 memory: 0000000060000000 @ 0000000020000000 (usable)
 memory: 0000000010000000 @ 0000000410000000 (usable)
Wasting 321552 bytes for tracking 5742 unused pages
Initrd not found or empty - disabling initrd
Zone PFN ranges:
  Normal   0x0000166e -> 0x00420000
Movable zone start PFN for each node
early_node_map[4] active PFN ranges
    0: 0x0000166e -> 0x00002860
    0: 0x00002900 -> 0x0000fd00
    0: 0x00020000 -> 0x00080000
    0: 0x00410000 -> 0x00420000
PERCPU: Embedded 10 pages/cpu @a8000000045a0000 s11392 r8192 d21376 u65536
pcpu-alloc: s11392 r8192 d21376 u65536 alloc=16*4096
pcpu-alloc: [0] 00 [0] 01 [0] 02 [0] 03 [0] 04 [0] 05 [0] 06 [0] 07
pcpu-alloc: [0] 08 [0] 09 [0] 10 [0] 11 [0] 12 [0] 13 [0] 14 [0] 15
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 458560
Kernel command line:  mtdparts=phys_mapped_flash:512k(bootloader)ro,2560k(kernel),4096k(cramfs),960k(jffs2),64k(bootloader_env)ro console=ttyS0,115200
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Primary instruction cache 32kB, virtually tagged, 4 way, 64 sets, linesize 128 bytes.
Primary data cache 16kB, 64-way, 2 sets, linesize 128 bytes.
Memory: 2018800k/2070472k available (4213k kernel code, 50912k reserved, 1344k data, 18376k init, 0k highmem)
Hierarchical RCU implementation.
NR_IRQS:216
Calibrating delay loop (skipped) preset value.. 1600.00 BogoMIPS (lpj=8000000)
Security Framework initialized
Mount-cache hash table entries: 256
Checking for the daddi bug... no.
SMP: Booting CPU01 (CoreId  1)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU02 (CoreId  2)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU03 (CoreId  3)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU04 (CoreId  4)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU05 (CoreId  5)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU06 (CoreId  6)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU07 (CoreId  7)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU08 (CoreId  8)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU09 (CoreId  9)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU10 (CoreId 10)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU11 (CoreId 11)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU12 (CoreId 12)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU13 (CoreId 13)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU14 (CoreId 14)...
CPU revision is: 000d030b (Cavium Octeon+)
SMP: Booting CPU15 (CoreId 15)...
CPU revision is: 000d030b (Cavium Octeon+)
Brought up 16 CPUs
NET: Registered protocol family 16
Not in host mode, PCI Controller not initialized
bio: create slab <bio-0> at 0
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
Switching to clocksource OCTEON_CVMCOUNT
NET: Registered protocol family 2
IP route cache hash table entries: 65536 (order: 7, 524288 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
NET: Registered protocol family 1
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
/proc/octeon_perf: Octeon performace counter interface loaded
Octeon watchdog driver loaded with a timeout of 5368 ms.
init_vdso successfull
HugeTLB registered 2 MB page size, pre-allocated 0 pages
JFFS2 version 2.2. (NAND) .. 2001-2006 Red Hat, Inc.
msgmni has been set to 3944
alg: No test for stdrng (krng)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Serial: 8250/16550 driver, 2 ports, IRQ sharing disabled
serial8250.0: ttyS0 at MMIO 0x1180000000800 (irq = 58) is a OCTEON
console [ttyS0] enabled, bootconsole disabled
console [ttyS0] enabled, bootconsole disabled
brd: module loaded
loop: module loaded
pata_octeon_cf pata_octeon_cf: version 2.1 8 bit.
scsi0 : pata_octeon_cf
ata1: PATA max PIO6 cmd 900000001d000800 ctl 900000001d00080e
mdio-octeon: probed
mdio-octeon mdio-octeon.0: Version 1.0
Intel(R) PRO/1000 Network Driver - version 7.3.21-k5-NAPI
Copyright (c) 1999-2006 Intel Corporation.
e1000e: Intel(R) PRO/1000 Network Driver - 1.0.2-k2
e1000e: Copyright (c) 1999-2008 Intel Corporation.
sky2 driver version 1.25
OcteonUSB: Detected 0 ports
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usbcore: registered new interface driver libusual
i2c /dev entries driver
i2c-octeon i2c-octeon.0: octeon_i2c_write: bad status before write (0x20)
rtc-ds1307: probe of 0-0068 failed with error -5
i2c-octeon i2c-octeon.0: version 2.0
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
md: multipath personality registered for level -4
md: faulty personality registered for level -5
device-mapper: ioctl: 4.15.0-ioctl (2009-04-01) initialised: dm-devel@redhat.com
oprofile: using mips/octeon performance monitoring.
TCP cubic registered
NET: Registered protocol family 17
Bootbus flash: Setting flash for 8MB flash at 0x1f400000
phys_mapped_flash: Found 1 x16 devices at 0x0 in 8-bit bank
 Amd/Fujitsu Extended Query Table at 0x0040
phys_mapped_flash: Swapping erase regions for broken CFI table.
number of CFI chips: 1
cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness.
5 cmdlinepart partitions found on MTD device phys_mapped_flash
Creating 5 MTD partitions on "phys_mapped_flash":
0x000000000000-0x000000080000 : "bootloader"
0x000000080000-0x000000300000 : "kernel"
0x000000300000-0x000000700000 : "cramfs"
0x000000700000-0x0000007f0000 : "jffs2"
0x0000007f0000-0x000000800000 : "bootloader_env"
Freeing unused kernel memory: 18376k freed
/sbin/rc starting
Updating module dependencies
Loading IPv6 module
NET: Registered protocol family 10
Mounting file systems
Setting up loopback
Starting syslogd
Starting telnetd
Jan  1 00:00:01 (none) sJan  1 00:00:01 (none) daemon.info init: Starting pid 930, console /dev/ttyS0: '/bin/sh'


BusyBox v1.2.1 (2010.04.21-04:55+0000) Built-in shell (ash)
Enter 'help' for a list of built-in commands.

~ # modprobe octeon-ethernet
octeon-ethernet 2.0
Interface 1 has 4 ports (RGMII)
Interface 2 has 4 ports (NPI)
~ # eth0: 100 Mbps Full duplex, port 16, queue 16
udhcpc eth0
udhcpc (v1.2.1) started
Jan  1 00:01:53 (none) local0.info udhcpc[942]: udhcpc (v1.2.1) started
Sending discover...
Jan  1 00:01:53 (none) local0.debug udhcpc[942]: Sending discover...
Sending select for 192.168.162.150...
Jan  1 00:01:53 (none) local0.debug udhcpc[942]: Sending select for 192.168.162.150...
Lease of 192.168.162.150 obtained, lease time 172800
Jan  1 00:01:54 (none) local0.info udhcpc[942]: Lease of 192.168.162.150 obtained, lease time 172800
deleting routers
SIOCDELRT: No such process
adding dns 192.168.16.11
adding dns 192.168.16.10
~ # ping -c 3 www.google.com
PING www.l.google.com (74.125.19.147) 56(84) bytes of data.
64 bytes from nuq04s01-in-f147.1e100.net (74.125.19.147): icmp_seq=1 ttl=55 time=23.0 ms
64 bytes from nuq04s01-in-f147.1e100.net (74.125.19.147): icmp_seq=2 ttl=55 time=10.3 ms
64 bytes from nuq04s01-in-f147.1e100.net (74.125.19.147): icmp_seq=3 ttl=55 time=9.06 ms

--- www.l.google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 9.069/14.149/23.039/6.307 ms

~ #
    

Generated on Wed Sep 8 16:23:58 2010 for Octeon Software Development Kit by  doxygen 1.4.2
