Linux-boot_optimization

I wanted to fiddle around with compiling the kernel, U-Boot, and BusyBox, and putting them all onto an SD card. While doing that, I saw my boot time come to around 7 seconds. Two things caught my eye in the dmesg logs:

Many modules are getting loaded which I don't seem to be using.
Somehow my root file-system is taking too long to mount.

Optimizing Modules

I thought if I could just set the modules to <N> at the time of menuconfig, it would work.

The thing is, most of the time we use defconfig files, and with so many interdependent config variables it can be hard to build such a kernel configuration consistently. You could still try, as it might save some milliseconds in booting thanks to a smaller module directory or kernel image (check with du -sh /lib/modules or du -sh /boot/zImage).

But the best bet is to use a custom device tree. Unused hardware can be removed from it, which ensures the corresponding modules won't be loaded.

Steps:

Go to the top of the Linux source tree, where you can see arch and scripts, and save this path: export CURR=$(pwd)
Go to the dts folder. For me it was cd ${CURR}/arch/arm/boot/dts/ti/omap
Taking the BeagleBone Black as the device, run the preprocessor: cpp -nostdinc -undef -x assembler-with-cpp -I./ -I${CURR}/include am335x-boneblack.dts am335x-boneblack.dts.cpp. The pre-processed file is now created.
Now edit out the hardware that is not needed. There is also a way to disable a node rather than delete it: the status property goes inside the node itself, as status = "disabled";.
Compile the device tree: ${CURR}/scripts/dtc/dtc -@ -I dts -O dtb -o am335x-boneblack.dtb am335x-boneblack.dts.cpp. The -@ option emits symbols, which helps later with device-tree overlay support.
Now just copy the output dtb file to your boot device's boot partition.
Now check the boot time. You could use this code as the init application:

/* File: time_to_boot.c */
#include<time.h>
#include<stdio.h>
#include<unistd.h>
#include<stdint.h>
int main(){
        struct timespec ts;/* time_t tv_sec ; long tv_nsec ; */
        while(1){
                /* dmesg uses Monotonic clock to print logs */
                clock_gettime(CLOCK_MONOTONIC,&ts);
                /* sec is time_t and nsec is long */
                printf("sec:%jd nsec:%09ld\n",  (intmax_t)ts.tv_sec,
                                                ts.tv_nsec);
                sleep(1);
        }
        /* shouldn't reach here */
        return -1;
}

Using this code you can check the boot time. Just compile it: ${CROSS_COMPILE}gcc -static -o init time_to_boot.c. This command creates a static binary (independent of dynamic libraries). Place it inside /sbin; the kernel will then find /sbin/init and spawn it as the first process (which was previously the busybox symlink).

Checking File-system

In embedded system development, I sometimes found myself stuck with no console, so I had to power off using the board's off button. This might have corrupted the ext4 filesystem. Do run fsck /dev/sdxn on the unmounted partition. fsck saved me a lot of time: the filesystem issue alone was costing me a 5-second boot penalty.

Booting via ram-disk

I felt that even if the rootfs gets broken, there should at least be a console available before it is mounted. What if we boot into a stable read-only system, and everything data-related (logging, docs, and so on) lives on the rootfs partition? That way, whenever something happens to the rootfs partition, the system still works. Another case to think about: say your device is a camera. The moment you push the power button, you expect to see a viewfinder, ready to take a picture. If I wanted to look at old photos, I could perhaps wait for the disk to load.

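To create a ramdisk we first have to create a file and decide on a size. Fill a file with zeroes: dd if=/dev/zero of=initrd.img bs=1024 count=16000. This creates a file of 16384000 bytes (about 15.6 MiB, slightly under 16 MiB). Then put a filesystem on it, for example mkfs.ext2 -F initrd.img.
Create a mount point: mkdir initrd.mount. Then loop-mount the image: sudo mount -o loop initrd.img initrd.mount. (Mounting with -t tmpfs would only create a fresh in-memory filesystem at the mount point; nothing would be written into initrd.img.)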
Now you can use it and fill it with utils, such as busybox, modules, vdso.
Unmount it sudo umount initrd.mount. Save initrd.img in boot partition.
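In U-Boot, edit either uEnv.txt or, using the U-Boot shell, the environment U-Boot keeps in its FAT file. The kernel bootargs (command-line parameters) need to change to boot from the ramdisk: set root to root=/dev/ram0. If the image is larger than the kernel's default RAM disk size, ramdisk_size=<KiB> may also be needed.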
The bootcmd also needs to change. If earlier the bootcmd was load mmc 0:1 0x81000000 zImage; load mmc 0:1 0x82000000 am335x-boneblack.dtb; bootz 0x81000000 - 0x82000000; it now needs to be load mmc 0:1 0x81000000 zImage; load mmc 0:1 0x82000000 am335x-boneblack.dtb; load mmc 0:1 0x83000000 initrd.img; bootz 0x81000000 0x83000000:0xfa0000 0x82000000. U-Boot reads the size after the colon as hexadecimal, so the 16384000-byte image is written as 0xfa0000. There is also a way to specify the option initrd= in the kernel command-line parameters, which I will be exploring later.
The init application can now be made to mount the data partition (the old root) inside the ramdisk.
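
The size after the colon in bootz is easy to get wrong, since U-Boot parses it as hexadecimal. Here is a minimal host-side sketch, assuming a POSIX shell; hex_size is a hypothetical helper name, not a U-Boot or kernel tool:

```shell
# hex_size FILE: print FILE's size in bytes as a hex literal,
# suitable for the <addr>:<size> ramdisk argument of bootz.
# (hex_size is a made-up helper name for this sketch.)
hex_size() {
    bytes=$(wc -c < "$1")
    printf '0x%x\n' $((bytes))
}
```

For a 16384000-byte image, hex_size initrd.img prints 0xfa0000. Alternatively, U-Boot sets the filesize variable (in hex) after each load, so ${filesize} can be used right after loading initrd.img.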

U-boot's Falcon

In Falcon boot, the SPL (i.e. MLO) directly loads the kernel, DTB, and ramdisk, appends the kernel command line, and boots using bootm. The gist is as follows:

Edit flags in U-Boot's menuconfig: CONFIG_SPL_LEGACY_IMAGE_FORMAT=y, CONFIG_LEGACY_IMAGE_FORMAT=y, # CONFIG_SPL_ENV_IS_NOWHERE is not set, CONFIG_SPL_ENV_IS_IN_FAT=y.
Let the board boot as normal and interrupt at the shell of U-Boot proper. Now load your DTB, ramdisk, and kernel into memory.
Now create an args file using spl export fdt <kernaddr> <rdaddr> <dtbaddr>
Now read the logs: a final line says where in RAM the args were placed and where they end. Use these addresses to calculate the size in hex, such as 45d3. Then write the file to the boot filesystem using fatwrite mmc 0:1 0x<startaddr> args <size>.
Now tell SPL that we want it to boot directly using setenv boot_os 1.
Save!!! saveenv. Note: bootcmd should be appropriately set.
Now you can safely reset.
Voila
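
The size calculation in the steps above is a plain hex subtraction. Here is a small host-side sketch, assuming a POSIX shell with 64-bit arithmetic; args_size is a hypothetical helper, and the two addresses are made-up examples rather than real log output:

```shell
# args_size START END: print END - START in hex without a 0x prefix,
# the form used for the <size> argument of fatwrite.
# Assumes END is the first address past the args (exclusive end).
args_size() {
    printf '%x\n' $(( $2 - $1 ))
}

# Made-up addresses, chosen so the result matches the 45d3 example.
args_size 0x8ffe5000 0x8ffe95d3
```

If the end address in your log is inclusive, add 1 to the result.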

Memory technology & speed

We have technologies such as SD/MMC, eMMC, NOR flash, and NAND flash, and among them a variety of protocols like OSPI, QSPI, SPI, and more. At the end of the day, if you are going to build your own board, the only way to know is to get an evaluation board and test all interfaces for speed. The technology also improves every year: a 5-year-old SD card gives 1 MB/s, but a new one gives around 2-3 MB/s. That is a huge saving, since your kernel, initramfs, and dtb files will easily come to more than 7 MB.
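
To see what those transfer rates mean for boot time, here is a back-of-the-envelope sketch in shell. load_time_ms is a hypothetical helper; it takes 1 MB as 1000000 bytes, and real loads in SPL or U-Boot may not match the rate measured elsewhere.

```shell
# load_time_ms SIZE_BYTES RATE_BYTES_PER_SEC
# Print the estimated time, in whole milliseconds, to read SIZE_BYTES
# at a sustained RATE_BYTES_PER_SEC (integer division, rounds down).
load_time_ms() {
    echo $(( $1 * 1000 / $2 ))
}

load_time_ms 7000000 1000000   # 7 MB at 1 MB/s
load_time_ms 7000000 3000000   # 7 MB at 3 MB/s
```

The two calls print 7000 and 2333, i.e. roughly 7 s versus 2.3 s just to load the images. On a running Linux system, something like dd if=/dev/mmcblk0 of=/dev/null bs=1M count=64 gives a rough read rate (the device name is only an example), though the page cache can inflate repeat runs.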

A better option: fast but small, slow but huge

We can have two devices. One is fast with a small capacity, so it costs a bit less; place the bootfs and ramdisk there. The other device holds the rootfs, which complements the initramfs. There is no need to pivot_root: plan your rootfs so that it doesn't overlap with the initramfs, and mount everything side by side. Choose init processes that don't require dynamic libraries. If you do want those dynamic libraries, store them on the faster device, which then has to be larger.

Pack the kernel

Compression statistics are a deeply debated topic and hard to pin down. Try running benchmarks on different methods, and compare them all on load time + decompression time.
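
A rough way to start such a benchmark on the host is to compare compressed sizes and turn them into an estimated load time. This is only a sketch: compare_compressors is a hypothetical helper, the tool list is an assumption (missing tools are skipped), and decompression time still has to be measured on the target itself.

```shell
# compare_compressors FILE RATE_BYTES_PER_SEC
# For each compressor found on this host, print:
#   <tool> <compressed bytes> <estimated load time in ms>
compare_compressors() {
    for tool in gzip xz lz4 zstd; do
        # Skip tools that are not installed.
        command -v "$tool" >/dev/null 2>&1 || continue
        bytes=$("$tool" -c "$1" 2>/dev/null | wc -c)
        printf '%s %d %d\n' "$tool" $((bytes)) $((bytes * 1000 / $2))
    done
}
```

For example, compare_compressors arch/arm/boot/Image 3000000 would estimate the load time of each compressed variant of the uncompressed kernel image at 3 MB/s.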

A "trust me bro" piece of advice: please don't load vmlinux directly. That's incredibly slow, as the bottleneck is not decompression but loading from storage into memory.

Future work

Measuring the speeds of the memory devices and getting a table ready for the BeagleBone Black. I also got myself a PocketBeagle 2; its boot flow involves three different processors, an R5, an M4, and an A53 (hmm, that's more work to do).

Conclusion

I saw a boot time of 0.5 seconds from the U-Boot handoff to the init program. There is also a path to reducing U-Boot's own overhead, which should be looked into as well. In my personal opinion, for embedded devices, U-Boot is good for development, but for a shipping product it is quite chunky. I feel the possibility of using Falcon should be considered; I saw a second of overall boot time improvement. [...] of beaglebone), that initializes the DRAM and only supports 2 commands, load and bootz.

by Sidharth Seela