Booting Octeon from NAND Flash

ebookman 2012-03-12

1. Introduction

Beginning with CN52XX pass 2, Octeon chips support booting from NAND flash instead of the normal NOR flash. NAND flash presents different technical issues than NOR flash that substantially change the boot process. This document describes the design and implementation of NAND boot supported by the Octeon SDK.

2. Technical Differences between NAND and NOR

A few technical differences between NAND and NOR flash cause booting to be much different between the two. Here are some of the key points:

Random addressing: NAND does not support random access, while NOR does. As a processor boots, it must jump to different code locations and read data from various locations in flash. With NOR flash, the processor can do this by requesting individual bytes from any location, because NOR bytes are individually addressable. NAND, on the other hand, does not support direct addressing. Instead, the NAND flash responds to commands to read blocks of data in a stream-oriented fashion.

Bad blocks: NOR flash guarantees that all blocks can be programmed without error for a specified number of write cycles. This allows data to be stored in NOR flash without any error correction code. NAND does not guarantee perfect blocks, but instead guarantees a maximum number of bit errors per device. Users of NAND must be able to detect and fix single-bit errors. Multiple-bit errors are unlikely, but still need to be detected.

Address linearity: NAND flash supports out-of-band data for error detection and correction that doesn't fall into the normal linear memory addressing scheme. Unlike NOR, NAND device addresses don't increment linearly and may contain addressing holes between erase blocks.
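
The access-model difference can be illustrated with a toy model. This is only a sketch: the page and out-of-band sizes are assumed values, the class names are made up for the example, and no real NAND command set is modeled. It shows that a NOR-style byte read is a direct lookup, while fetching one byte from NAND means reading out the whole page that contains it.

```python
PAGE_SIZE = 2048   # assumed data bytes per page (device dependent)
OOB_SIZE = 64      # assumed out-of-band (spare) bytes per page


class ToyNor:
    """Byte-addressable flash: any offset can be read directly."""

    def __init__(self, image):
        self._image = image

    def read_byte(self, offset):
        return self._image[offset]


class ToyNand:
    """Page-oriented flash: data leaves the device one page at a time."""

    def __init__(self, image):
        # Pad to a whole number of pages with erased bytes (0xFF).
        pad = -len(image) % PAGE_SIZE
        self._image = image + b"\xff" * pad
        self.page_reads = 0

    def read_page(self, page):
        """Return (data, oob) for one page; the OOB here is a dummy."""
        self.page_reads += 1
        start = page * PAGE_SIZE
        data = self._image[start:start + PAGE_SIZE]
        oob = bytes(OOB_SIZE)  # placeholder for ECC and bad-block markers
        return data, oob

    def read_byte(self, offset):
        # A single byte still costs a full page read.
        page, column = divmod(offset, PAGE_SIZE)
        data, _oob = self.read_page(page)
        return data[column]
```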

3. Octeon's NAND Boot Stages

In order to properly support NAND, Octeon uses three stages to bootstrap a core into the normal RAM-resident bootloader. These stages are divided up as follows:

3.1. Stage 1 - Initial NAND

The first stage bootloader is fetched by the Octeon processor as soon as the first processor comes out of reset. These instructions are fetched from the first page of NAND in an Octeon proprietary ECC format. 256 bytes, or 64 instructions, are fetched from NAND along with 8 bytes of ECC information. After the ECC fixes any correctable errors, this data is cached inside the Octeon NAND controller. These 64 instructions must program the NAND controller and fetch the rest of stage 1. If ECC fails to fix an error in this early boot mode, Octeon will halt execution. For this reason it is highly desirable to keep the amount of code executing in this stage to an absolute minimum.

3.2. Stage 2 - Memory Setup

Since NAND flash blocks support a much lower number of write cycles than NOR flash, parts of boot that generally require tweaks and frequent updates have been moved into a second stage. This stage can be located anywhere in the first 4MB of NAND. Stage 1 finds it by searching for a specific header. Once loaded, this stage must set up DRAM and find the third and final stage. To make application support easier, stage 2 normally loads stage 3 from a JFFS2 filesystem. Stage 2 supports saving its environment to NAND. (For development purposes, stage 2 may be used as the final bootloader, as long as it is small enough to boot from L2.)

3.3. Stage 3 - Final Bootloader

The final stage of booting is responsible for the application-visible interface. By default this stage is a U-boot bootloader with fairly full-featured access to compact flash, USB, NAND, and PCIe. This stage can be freely replaced with any bootloader a customer wishes. Since DRAM initialization was performed in stage 2, this bootloader isn't as dependent on board configuration as a normal NOR flash based bootloader.

4. Details of Stage 1

Stage 1 booting entails three logical code sections. In the first section, Octeon is executing directly out of NAND flash. This section must be entirely in assembly and be as small as possible.

Octeon comes out of reset, fetching instructions from NAND block 0, page 0.
Enable 64-bit addressing.
Clear COP0 STATUS[BEV].
Set NAND size parameters for early boot.
Disable Icache prefetching.
Set up the UART baud rate based on Octeon's clock.
Print the "Octeon NAND Boot" banner.
Set up L2 cache index aliasing.
Lock the memory where stage 1 is linked into L2.
Copy stage 1 code from NAND into L2 using the stage 1 link address. Note that the default stage 1 link address is 0xffffffff80400000. It may be changed by editing the stage 1 makefile.
Jump into L2.
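
The default link address is easier to read once it is split into its parts. The sketch below assumes the standard MIPS64 segment layout, in which a 32-bit kseg0 address (unmapped, cached) is sign-extended to 64 bits and maps to physical memory by dropping the top three bits of the 32-bit value. The function names are illustrative only and are not part of the SDK.

```python
STAGE1_LINK_ADDRESS = 0xFFFFFFFF80400000  # default value quoted in the text


def sign_extend_32(value):
    """Sign-extend a 32-bit value to 64 bits, as MIPS64 does for 32-bit addresses."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def kseg0_to_physical(vaddr):
    """Translate a sign-extended kseg0 virtual address to a physical address."""
    low = vaddr & 0xFFFFFFFF
    in_kseg0 = 0x80000000 <= low <= 0x9FFFFFFF
    if not in_kseg0 or sign_extend_32(low) != vaddr:
        raise ValueError("not a kseg0 address")
    return low & 0x1FFFFFFF


# The link address is the 64-bit form of 0x80400000.
assert sign_extend_32(0x80400000) == STAGE1_LINK_ADDRESS
print(hex(kseg0_to_physical(STAGE1_LINK_ADDRESS)))  # prints 0x400000
```

Under that assumption, the default link address corresponds to physical address 0x400000, that is, 4 MB above the start of physical memory.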

The second section is a short amount of assembly executing from L2. This code is simply responsible for setting up a stack and transitioning into C code. Section 2 is the first code that executes in an environment where unrecoverable NAND errors can be handled gracefully.

Set up CVMSEG memory for stack space.
Jump into C code at main().

The third section, written in C, is responsible for finding and loading stage 2. Logically, this section searches NAND for the appropriate stage 2, loads it, and jumps to it.

Install exception vectors to display messages if something unexpected happens. This involves placing the two bootbus moveable regions on the reset vector and the EJTAG exception vector. The exception vectors 0x0, 0x80, 0x100, 0x180, and 0x200 are configured to jump to the stage 1 exception dump code. This code displays all registers to the console.
Set up the NAND controller to disable boot mode and switch to the normal command queue mode.
Read GPIO 0 to determine if the failsafe or normal stage 2 image should be loaded.
Search for an image and load it into L2.
If an image was found, jump into it. This completes stage 1 and stage 2 takes over.
If an image was not found and we weren't trying to load the failsafe, try again searching for the failsafe image.
If we still haven't found an image, print a message and halt. A chip reset will be needed to recover.
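
The image selection and fallback logic in the steps above can be sketched as follows. This is a simplified model, not the SDK's C code: `search` is a hypothetical callback standing in for the NAND search, and which GPIO 0 level selects the failsafe image is not stated in this document, so the caller passes the decoded request as a boolean.

```python
def select_and_load(gpio0_requests_failsafe, search):
    """Return the kind of stage 2 image to jump to, or None if stage 1 must halt.

    `search(failsafe)` is a hypothetical callback that returns True when an
    image of the requested kind was found and loaded into L2.
    """
    want_failsafe = gpio0_requests_failsafe
    if search(want_failsafe):
        return "failsafe" if want_failsafe else "normal"
    # Fall back to the failsafe image only if we were not already looking for it.
    if not want_failsafe and search(True):
        return "failsafe"
    print("No stage 2 image found; halting until chip reset")
    return None
```

For example, if only a failsafe image is present in flash, a normal boot request still ends up in the failsafe image, while a board with no valid image at all halts.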

The exact details of searching flash for an image are as follows:

1. Print a message describing the image we are looking for.
2. Set our current NAND ECC block to zero.
3. Set our current state to LOOKING_FOR_HEADER.
4. Read 264 bytes from NAND. Increment our current NAND block.
5. Use the 8 bytes of ECC data to correct any errors.
6. If there are uncorrectable errors:
   a. If we are in state READING_DATA, print a message that a bad block was found and return to state LOOKING_FOR_HEADER.
   b. Go to step 4, reading more data.
7. If we are in state LOOKING_FOR_HEADER and there is a valid header:
   a. Print a message showing where the header is and the size of the image.
   b. Lock the range of addresses needed for the image into L2.
   c. Set our L2 address to the one specified in the header.
   d. Copy the first 256 bytes of the image to the correct load address found in the header.
   e. Change to state READING_DATA.
   f. Increment our L2 address by 256.
   g. Go to step 4, reading more data.
8. If we are in state READING_DATA:
   a. Print a message if errors were corrected by ECC.
   b. Increment our L2 address by 256.
   c. If we've gotten enough data for the image, check the image's CRC.
   d. If the CRC matches, then we have a valid image and we're done searching.
   e. If the CRC is invalid, then print a message, return to state LOOKING_FOR_HEADER, and reset our L2 address.
9. If the current ECC block is past the low 4MB of NAND, return failure. Otherwise, go to step 4.
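
The search procedure above is a two-state scan over 264-byte ECC blocks, which can be sketched in a few lines. Everything format-specific here is an assumption made for illustration: the header layout (`MAGIC`, load address, length, CRC32), the choice to compute the CRC over the bytes after the first 256-byte block, and `toy_ecc`, which only detects errors and stands in for Octeon's proprietary correcting ECC. L2 locking is not modeled, and the caller chooses `max_blocks` to represent the low 4MB limit.

```python
import struct
import zlib

DATA_BYTES, ECC_BYTES = 256, 8
BLOCK_BYTES = DATA_BYTES + ECC_BYTES   # 264 bytes read per step
MAGIC = b"STG2"                        # hypothetical header magic
HEADER = struct.Struct(">4sQII")       # magic, load address, image length, CRC32


def toy_ecc(data):
    """Stand-in for the proprietary ECC: detects errors, corrects nothing."""
    return struct.pack(">I", zlib.crc32(data)) + bytes(4)


def read_block(nand, index):
    """Return the 256 data bytes of ECC block `index`, or None if uncorrectable."""
    raw = nand[index * BLOCK_BYTES:(index + 1) * BLOCK_BYTES]
    data, ecc = raw[:DATA_BYTES], raw[DATA_BYTES:]
    if len(raw) != BLOCK_BYTES or toy_ecc(data) != ecc:
        return None
    return data


def find_image(nand, max_blocks):
    """Scan ECC blocks for a valid image; return (load_address, image) or None."""
    state, image, header = "LOOKING_FOR_HEADER", bytearray(), None
    for index in range(max_blocks):
        data = read_block(nand, index)
        if data is None:                       # uncorrectable error
            if state == "READING_DATA":
                print(f"bad block at ECC block {index}")
            state = "LOOKING_FOR_HEADER"
            continue
        if state == "LOOKING_FOR_HEADER":
            magic, load, length, crc = HEADER.unpack_from(data)
            if magic != MAGIC or length < DATA_BYTES:
                continue                       # not a header, keep scanning
            header, image = (load, length, crc), bytearray()
            state = "READING_DATA"
        image += data                          # header block or payload block
        load, length, crc = header
        if len(image) >= length:               # enough data: verify the image
            if zlib.crc32(bytes(image[DATA_BYTES:length])) == crc:
                return load, bytes(image[:length])
            print("CRC mismatch; resuming header search")
            state = "LOOKING_FOR_HEADER"
    return None                                # ran past the search limit
```

As in the step list, a bad block or a CRC failure does not rewind the scan; the search simply continues from the next ECC block, so a second copy of the image placed later in flash can still be found.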

5. Stage 2 Execution Environment

When stage 2 assumes control, it can assume the following execution environment:

Processor is in kernel mode with 64-bit addressing enabled.
COP0 STATUS[BEV] is clear. Stage 2 should install exception vectors as soon as possible. Note that in order to modify COP0 EBASE, you must set STATUS[BEV], change the value, and then clear STATUS[BEV].
The UART is up and running in FIFO mode with the baud rate set.
The stage 2 code is locked into L2 and index aliasing is enabled.
Icache prefetching is disabled.
CVMSEG is enabled and set to 54 lines.

6. Stage 2 Boot Responsibilities

Stage 2 must perform the following operations before transferring control to stage 3:

Initialize DRAM.
Unlock and flush all of L2.
Change the size of CVMSEG to zero lines.
Disable access to CVMSEG.
Enable Icache prefetching.
Set up bootbus moveable regions on both the reset and EJTAG exception vectors.
Disable any Octeon-specific features. Stage 3 should begin with an environment as close to standard MIPS64r2 as possible. Stage 3 can enable more advanced features if needed.

7. Stage 3 Execution Environment

Stage 3 will begin executing at its link address, whatever that is, with DRAM initialized. All Octeon-specific features are disabled to maintain maximum MIPS compatibility. All of memory is available for use by stage 3.

8. Stage 3 Boot Responsibilities

Stage 3 has no strict requirements for its processing. It may perform any action such as initializing hardware, loading kernels, and booting secondary cores. Although Cavium provides a fully featured U-boot based stage 3, this stage may freely be replaced. For example, VxWorks ROMMON can be loaded instead of U-boot.

9. Final Notes

The three-stage design of the Octeon NAND boot process is intended to maximize flexibility with the final boot loader and minimize the hardware dependencies of stage 1. DRAM setup in stage 2 allows a common DRAM setup to be used on a board even when multiple OSes require different bootloaders. It is strongly recommended that stages 1 and 2 be used with as few changes as possible. All application-specific customization should be placed in stage 3.

10. NAND Boot Hands-On

This section will walk through bootstrapping a board with a blank NAND flash chip using a Macraigor EJTAG probe and the Octeon remote utilities. This is a multi-step process that will result in stage1 and stage2 being burned into NAND flash using the Octeon proprietary ECC encoding, and stage3 written to a JFFS2 filesystem using standard NAND ECC. U-boot can write stage1/stage2 and read stage3, and Linux is used to create and write stage3 into the JFFS2 filesystem. The specific examples below show the process for an EBT5200 board, and are run from the base directory of the OCTEON SDK install.

1) Build NAND stage1. This is in the OCTEON-SDK/bootloader/nand-boot directory. Type 'make' in this directory to build stage1. We will use the nand-boot.bin that is built here. Stage1 should not require any modifications unless a new flash device needs to be supported.

2) Connect the EJTAG probe to the board, and power them on. Set the OCTEON_REMOTE_PROTOCOL environment variable to "MACRAIGOR:probe_name_or_address,1000". This will configure the oct-remote* utilities to use the specified probe. Connect to UART 0 of the target board with a terminal program. Stage1 will display boot progress over the UART. Stage2 and the generic RAM based bootloader will boot to a u-boot prompt on the UART.

3) Boot the board using the generic board type. The DIMM SPD addresses must be specified on the command line, and the board delay may be required. (The default board delay value should support many boards, but some will require a different value.)

oct-remote-boot --ddr0spd=0x50 --ddr0spd=0x51  --board=generic

4) Load nand-boot.bin over EJTAG, and burn it to address 0 in NAND. Any errors will be reported on the serial port.

oct-remote-load 0 bootloader/nand-boot/nand-boot.bin
oct-remote-bootcmd "octnand write 0"

5) Load the stage2 bootloader and burn it into flash. Here we use the generic stage2 that has been compiled for the EBT5200 board. Any errors will be reported on the serial port.

oct-remote-load 0 target/bin/u-boot-octeon_generic_nand_stage2.bin
oct-remote-bootcmd "bootloaderupdate"

The 'bootloaderupdate' command finds a blank space in the low 4 Mbytes of flash and burns the stage2 image. This image has a header that stage1 searches for and uses to validate the image before running it. At this point the board should boot to the stage2 prompt after being reset:

Octeon nand#(stage2)

6) Now we are ready to create the JFFS2 filesystem. To do this, Linux must be booted on the board, and the stage3 bootloader must be available to Linux once booted. If the networking works under Linux on the board, the tftp utilities in the embedded root filesystem can be used to transfer the file over. Alternatively, the stage3 bootloader binary can be built into the embedded root filesystem - this will be described here, as it does not depend on networking on the target board. The generic ram based bootloader, u-boot-octeon_generic_ram.bin, will be the stage 3 bootloader in this example.

Make a directory named 'user-include' in the linux/embedded_rootfs directory, and copy the stage3 bootloader there. For this example, we will use the generic RAM bootloader as the stage3 bootloader.
Build the Linux kernel with the embedded root filesystem.
Load and boot the resulting Linux kernel. Note that the loading will take a while over EJTAG (10+ minutes). If networking is functional in the RAM based u-boot on the target board, tftp is a much quicker alternative.
```
oct-remote-boot --ddr0spd=0x50 --ddr0spd=0x51  --board=generic
oct-remote-load 0 linux/kernel_2.6/linux//vmlinux.64
oct-remote-bootcmd "bootoctlinux 0 mtdparts=octeon_nand0:4m(reserved)ro,8m(jffs2)"
```
The stage3 bootloader from the user-include directory will show up in the root directory of the embedded rootfs.
Mount the jffs2 partition, and copy the bootloader to it, then unmount the partition. In the above example, the JFFS2 partition is /dev/mtdblock1. The mapping of MTD partitions to device nodes is listed in the '/proc/mtd' file.
```
mount -t jffs2 /dev/mtdblock1 /mnt
cp u-boot-octeon_generic_ram.bin /mnt
umount /mnt
```
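
The `mtdparts` argument used in the boot command above determines which device node holds the JFFS2 partition. The sketch below parses only the subset of the syntax that appears in this example (a size with an optional k/m/g suffix, a name in parentheses, and an optional `ro` flag); the full kernel syntax also allows explicit offsets and other forms that are not handled here. The function name is made up for the example.

```python
import re

UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
PART_RE = re.compile(r"(\d+)([kmgKMG]?)\((\w+)\)(ro)?$")


def parse_mtdparts(arg):
    """Parse 'mtdparts=<mtd-id>:<size>(<name>)[ro],...' into partition dicts."""
    spec = arg.split("=", 1)[1]
    mtd_id, parts_text = spec.split(":", 1)
    offset, parts = 0, []
    for text in parts_text.split(","):
        match = PART_RE.match(text)
        if match is None:
            raise ValueError(f"unsupported partition spec: {text!r}")
        size = int(match.group(1)) * UNITS.get(match.group(2).lower(), 1)
        parts.append({
            "name": match.group(3),
            "offset": offset,          # partitions are laid out back to back
            "size": size,
            "read_only": match.group(4) is not None,
        })
        offset += size
    return mtd_id, parts


mtd_id, parts = parse_mtdparts("mtdparts=octeon_nand0:4m(reserved)ro,8m(jffs2)")
for number, part in enumerate(parts):
    print(f"mtd{number}: {part['name']} at {part['offset']:#x}, {part['size']:#x} bytes")
```

With the argument from the example, the first partition (index 0) is the read-only 4 MB reserved area holding stage1 and stage2, and the second (index 1) is the 8 MB JFFS2 partition, which matches the use of /dev/mtdblock1 in the mount command.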

7) Now we need to verify that stage2 can read stage3, and update the 'bootcmd' environment variable in stage2 to start stage3 automatically. We will now reset the board, and let it come to the stage 2 prompt. At the stage 2 prompt, issue the following commands to list the files in the jffs2 filesystem.

jffs2chpart jffs2
jffs2ls

8) Now try manually loading/starting stage 3. The generic RAM based u-boot expects to be loaded at 0x100000, so that is where we load it. Once it is loaded, we jump to the start address with the 'go' command.

jffs2load 0x100000 u-boot-octeon_generic_ram.bin
go 0x100000

The RAM based bootloader should boot at this point, and we should get the following prompt.

Octeon generic(ram)#

9) Now we need to set the 'bootcmd' environment variable in stage2 so that stage3 is booted automatically. Reset the board and let it come to the stage2 prompt.

Octeon nand#(stage2) setenv bootcmd 'jffs2chpart jffs2;jffs2load 0x100000 u-boot-octeon_generic_ram.bin;go 0x100000'
Octeon nand#(stage2) saveenv

10) When reset, the board should now boot to stage 3.

Octeon generic(ram)#

10.1 NAND boot implementation notes

MTD partitioning of the NAND flash is required for creating the JFFS2 filesystem under Linux, and is highly desirable for read-only access from u-boot stage2 as well. JFFS2 will attempt to use any erased block it is allowed to use, so the low 4 MBytes must be excluded when writing, otherwise blocks from this area may be used by JFFS2. Also, the u-boot JFFS2 implementation is fairly slow, and scanning large partitions can take a long time. If possible, the JFFS2 partition that contains stage3 should be kept small. The 'mtdids' and 'mtdparts' environment variables are used by u-boot to control the MTD partitioning that u-boot uses. Please see the environment section on the bootloader documentation page for more information on these and other environment variables.
Parameter passing between stage2 and stage3 is done by stage2 placing a u-boot environment structure in DRAM at a fixed location. The next stage can validate this (it includes a CRC) and get any values of interest from it. The structure is simply a CRC followed by a series of NUL-terminated strings. Two consecutive NULs terminate the list of environment variables. The only value that is passed by default is the DRAM size, as stage3 needs to get this from stage2. (Alternatively, it can query the DIMMs and recalculate this value, but getting it from stage2 is simpler.) Additional information can be easily added to the environment before starting stage3.
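
The environment structure described above (a CRC, then NUL-terminated strings, ended by two consecutive NULs) can be sketched as follows. Several details are assumptions made for illustration and should be checked against the actual bootloader sources: the CRC is taken to be a 4-byte big-endian CRC32 over the string data that follows it, entries are taken to be `name=value` strings, and the variable name `dram_size_mbytes` is hypothetical.

```python
import struct
import zlib


def pack_env(variables):
    """Serialize {name: value} as CRC32 + 'name=value' NUL-terminated entries."""
    data = b"".join(
        f"{name}={value}".encode("ascii") + b"\0"
        for name, value in variables.items()
    )
    data += b"\0"  # a second consecutive NUL ends the list
    return struct.pack(">I", zlib.crc32(data)) + data


def unpack_env(blob):
    """Validate the CRC and return the variables, or raise ValueError."""
    (crc,) = struct.unpack_from(">I", blob)
    data = blob[4:]
    if zlib.crc32(data) != crc:
        raise ValueError("environment CRC mismatch")
    variables = {}
    for entry in data.split(b"\0"):
        if not entry:  # empty entry means two NULs in a row: end of list
            break
        name, _, value = entry.decode("ascii").partition("=")
        variables[name] = value
    return variables


blob = pack_env({"dram_size_mbytes": "2048"})  # hypothetical variable name
print(unpack_env(blob))
```

A receiving stage that follows this pattern rejects the structure outright when the CRC does not match, rather than trusting a partially corrupted DRAM size.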

11. Porting stage2 to a new board

The second stage of the NAND boot process is responsible for two things: configuring the DRAM controller(s), and starting the final (stage3) bootloader. As such it needs the details of the DRAM configuration for the board, and it needs to be able to read stage 3 from the NAND flash part. The "octeon_generic_nand_stage2_config" is a stage2 bootloader configuration that can be easily modified to support a new board. (If the NAND flash for the board in question is not supported by u-boot, support will need to be added. This NAND support should be developed using the 'octeon_generic_ram_config' DRAM based u-boot configuration as this provides the simplest development environment for u-boot changes.)

The following changes must be made:

Update the DRAM_SOCKET_CONFIGURATION #define to reflect the DIMM SPD addresses on the board. This is required for the DRAM initialization code.
Change the BOARD_DELAY parameter to an appropriate value for the board. This is a board dependent value that depends on the board layout. (Additionally, the LMC_DELAY_* parameters should be set once the correct values are determined.)
Set the 'bootcmd' environment variable in the octeon_generic_nand_stage2.h header file. This command will load and start execution of the stage3 (final) bootloader.

The following changes should be made:

Add a new board type to cvmx-app-init.h, and update the NAND_STAGE2_BOARD_TYPE define to use the new board type. This allows per-board customization within the stage2 bootloader if other changes are required.
Adjust the default NAND partitioning as required.

Generated on Wed Sep 8 16:23:58 2010 for Octeon Software Development Kit.