
  • Tuning Compiler Flags for Custom Hardware

Benchmarking SPECint on FPGA: Introduction

With the growing interest in AI hardware for high-performance and power-efficient computing, understanding how industry-standard benchmarks perform on such platforms is critical. In this paper, we focus on the SPECrate®2017 Integer workloads, a widely used CPU benchmark suite, and share a case study comparing two runs on an FPGA target: a base run and a tuned run that achieved better performance. This paper describes how the tuning and benchmarking procedure was executed, the challenges we faced, and what we learned from this hands-on analysis.

Why SPECint on FPGA?

SPECrate®2017 Integer evaluates the integer processing capabilities of a CPU through a set of compute-intensive, single-threaded programs. Running these on an FPGA (with soft or hardened CPU cores) helps evaluate and tune how custom logic performs in realistic software scenarios, especially workloads like compilers, compression, and AI preprocessing.

Benchmarking Setup

Platform: FPGA
Emulation: Pre-validation on QEMU, native execution on the FPGA target
Benchmark Suite: SPEC CPU2017
Cross-compilation: All benchmarks built with a target toolchain using specmake, applying -static and a base set of flags
Base Run: No tuning; baseline compiler flags, minimal memory tuning
Optimized Run: Enhanced compiler flags, better memory layout, cache tuning

Here’s how the benchmarking was carried out:

Cross-Compilation of SPECrate®2017 Integer Benchmarks: Ensured static linking for portability; verified the ELF binaries using file and readelf.
Execution with runspec: Invoked with runspec --config=target.cfg --tune=base --size=test,train,ref for initial testing.
Data Collection: Captured runtime, SPEC score, and individual benchmark outputs; tracked CPU MHz and instruction counts using perf or hardware counters.
Tricks: Use math models to reduce the run times of SPEC workloads. Get a sense of the test, train, and ref workloads and find a relation between them, so there is no need to run ref every time.
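One way to realize the "math models" trick above is a simple linear fit: calibrate a scale factor between train and ref runtimes on a few benchmarks, then predict ref runtimes from train runs alone. A minimal sketch (all runtimes below are illustrative placeholders, not measured SPEC data):

```python
# Estimate ref runtimes from train runtimes via a least-squares scale factor.
# All numbers are illustrative placeholders, not measured SPEC data.

def fit_scale(train_times, ref_times):
    """Scale factor k minimizing sum of (ref - k * train)^2."""
    num = sum(t * r for t, r in zip(train_times, ref_times))
    den = sum(t * t for t in train_times)
    return num / den

# Calibration runs: train and ref runtimes (seconds) for a few benchmarks
train = [120.0, 95.0, 210.0]
ref   = [1150.0, 900.0, 2050.0]

k = fit_scale(train, ref)

def predict_ref(train_time):
    """Predicted ref runtime for a benchmark given its train runtime."""
    return k * train_time

print(f"scale factor ~ {k:.1f}")
print(f"predicted ref time for a 150 s train run: {predict_ref(150.0):.0f} s")
```

Once the fit is validated against one or two full ref runs, day-to-day tuning iterations can rely on the much shorter train workloads.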

  • Performance Modelling: How to Predict and Optimize System Efficiency

1. Introduction

In today’s fast-paced digital world, system performance is critical to the success of applications ranging from cloud computing platforms to high-performance computing (HPC) workloads. Performance modelling is a powerful technique used to predict, analyze, and optimize the efficiency of computing systems. By simulating and understanding system behavior, developers, engineers, and IT managers can make informed decisions about design, scaling, and optimization strategies.

2. What is Performance Modelling?

Performance modelling is the process of creating abstract representations (models) of a system's behavior under various workloads and configurations. These models help predict how systems respond to changes in usage, hardware, software, or architecture. Performance models can be analytical, simulation-based, or empirical, each offering unique insights into system behavior.

3. Objectives of Performance Modelling

Prediction: Estimate system behavior before deployment.
Bottleneck Identification: Locate components that limit performance.
Optimization: Inform design choices to improve efficiency.
Capacity Planning: Guide resource allocation for current and future needs.
Cost Efficiency: Avoid over-provisioning and reduce operational expenses.

4. Key Techniques in Performance Modelling

Analytical Models: Use mathematical formulas to describe system performance.
Simulation Models: Create detailed simulations that mimic system behavior over time. These can be as simple as equations with simplifying assumptions, or models available online.
Empirical Models: Rely on real-world data and benchmarks to build predictive models. This is more involved, since it requires in-depth knowledge of the system architecture.

5. Steps in Developing a Performance Model

Define Goals: Determine what you want to achieve (e.g., optimize response time or throughput).
Collect Data: Gather metrics from logs, monitoring tools, or benchmarks.
Choose a Modelling Technique: Decide between analytical, simulation, or empirical models.
Build the Model: Construct the performance model using appropriate tools or software.
Validate the Model: Compare predictions with actual performance to ensure accuracy.
Analyze & Optimize: Use the model to explore different configurations and identify optimal settings.

6. Tools for Performance Modelling

Queuing models for analyzing response times
Simulators for detailed, event-based modeling
Benchmarking suites for real-world performance data
Profiling tools for low-level performance metrics

7. Applications of Performance Modelling

High-Performance Computing (HPC): Optimize cluster performance and parallel job scheduling.
Cloud Computing: Predict performance under varying loads and optimize resource allocation.
Software Engineering: Improve application architecture and identify inefficient code paths.
Enterprise IT: Plan for infrastructure upgrades and disaster recovery.

8. Challenges and Best Practices

Challenges:
Model accuracy vs. complexity trade-off
Data collection overhead
Environmental variability

Best Practices:
Keep models as simple as possible while maintaining accuracy
Continuously validate models against real performance
Use a combination of modelling techniques when necessary

9. Conclusion

Performance modelling is an indispensable approach for understanding, predicting, and optimizing system efficiency. Whether you're designing a new application, upgrading infrastructure, or managing a complex cloud environment, performance models can help you make better, data-driven decisions. By embracing the right modelling techniques and tools, organizations can improve performance, reduce costs, and deliver superior user experiences.
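To make the analytical-model category concrete, the classic M/M/1 queue predicts mean response time from just the arrival and service rates. A minimal sketch, assuming Poisson arrivals and exponential service times:

```python
def mm1_response_time(arrival_rate, service_rate):
    """Mean response time W = 1 / (mu - lambda) for a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)

# A server that can handle 100 req/s, offered 80 req/s of load:
w = mm1_response_time(80.0, 100.0)
print(f"mean response time: {w * 1000:.0f} ms")  # 50 ms
```

Even a model this simple shows the characteristic non-linear blow-up in latency as utilization approaches 100%, which is often all that is needed for first-pass capacity planning.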

  • Unleashing Performance Insights on ARM: Bringing Intel's PerfSpect to the Entire Ecosystem

Performance analysis can often feel like searching for a needle in a haystack. When your application isn't running as fast as you'd like, where do you even begin to look? Is it a memory bottleneck? Are you stalling in the CPU's front-end? Answering these questions is critical, but traditional tools can be complex and overwhelming. This is where Intel's PerfSpect comes in. And now, thanks to some recent contributions, this powerful tool is no longer just for x86 systems. I'm happy to share how I've been able to natively compile PerfSpect on the ARM architecture, enabling deep performance analysis on Neoverse-based platforms such as Ampere, AWS Graviton, Google Axion, NVIDIA Grace, and the Microsoft Cobalt series of processors.

Why PerfSpect? A Simpler Path to Performance Insights

PerfSpect is a lightweight, command-line performance analysis tool. Its primary strength lies in its use of the Top-Down Microarchitecture Analysis (TMA) methodology. Instead of drowning you in hundreds of raw performance counters, TMA provides a structured, hierarchical way to identify the primary bottleneck in your system. It breaks down CPU cycles into a few high-level categories:

Front-End Bound: The CPU isn't getting instructions fast enough.
Back-End Bound: Instructions are available, but the execution units are stalled. This is further broken down into:
  Core Bound: The computation units are the bottleneck.
  Memory Bound: The CPU is waiting on data from memory or caches.
Retiring: The CPU is successfully executing instructions. This is the "good" category.
Bad Speculation: The CPU wasted work on instructions that were ultimately discarded (e.g., due to branch misprediction).

By presenting performance data through this lens, PerfSpect makes it easy to pinpoint the character of your bottleneck and tells you exactly where to focus your optimization efforts.

The Competitive Landscape: How Does PerfSpect Compare?

PerfSpect doesn't exist in a vacuum. The Linux ecosystem is rich with powerful profiling tools like perf, Intel VTune Profiler, and AMD uProf. PerfSpect's unique value is its combination of simplicity, structured TMA methodology, and now, cross-architecture support. It provides actionable insights without the steep learning curve of raw perf or the complexity of a full-blown GUI profiler.

My Contribution: Native Support for ARM

My primary contribution was to port PerfSpect, enabling it to build and run natively on ARMv8/ARMv9 architectures. This involved mapping ARM Performance Monitoring Unit (PMU) events to the TMA categories, allowing the same intuitive reporting to work seamlessly on platforms from Ampere, Amazon, and Microsoft. Now, developers can use a single, familiar tool to analyze workloads across different server fleets.

Get Started: Build and Run PerfSpect on ARM

Ready to try it on your ARM machine? Here's how you can get it up and running.

Prerequisites: Ensure you have Python, pip, and the standard Linux performance tools installed.

# For Debian/Ubuntu-based systems
$ sudo apt-get update
$ sudo apt-get install -y python3 python3-pip linux-tools-common linux-tools-generic

Step 1: Clone the repository
$ git clone -b Neoverse-native-support https://github.com/Whileone-Techsoft/PerfSpect.git
$ cd PerfSpect

Step 2: Build the tools Docker image for aarch64
$ ./builder/build.sh

Step 3: Build PerfSpect natively on aarch64
$ make -j

Sample TMA output for Graviton4
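The top-level TMA split described earlier can be sketched in a few lines; the slot counts here are illustrative values, not the actual PMU events PerfSpect programs:

```python
def tma_level1(retiring, bad_spec, frontend_bound, backend_bound):
    """Return each top-level TMA category as a fraction of total pipeline slots."""
    total = retiring + bad_spec + frontend_bound + backend_bound
    return {
        "Retiring": retiring / total,
        "Bad Speculation": bad_spec / total,
        "Front-End Bound": frontend_bound / total,
        "Back-End Bound": backend_bound / total,
    }

# Illustrative slot counts from a hypothetical profile:
breakdown = tma_level1(retiring=420, bad_spec=60,
                       frontend_bound=120, backend_bound=400)
for category, share in breakdown.items():
    print(f"{category:16s} {share:5.1%}")
```

In this hypothetical profile the Back-End Bound share dominates the non-retiring slots, so the next step would be drilling into its Core Bound vs. Memory Bound children.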

  • Building Observability-Driven Performance Benchmarking Frameworks

In complex computing environments spanning cloud, HPC, AI, and edge workloads, observability is no longer optional. With multiple layers of hardware and software working together, traditional monitoring alone cannot surface the insights needed to optimize performance or prevent downtime. At Whileone Techsoft Pvt. Ltd., we help companies go beyond monitoring by building deep observability frameworks that connect performance benchmarking, system analytics, telemetry, and profiling. This integrated approach helps engineering teams gain complete visibility into their systems, enabling faster debugging, reduced operational costs, and enhanced end-user experiences.

Why Observability Matters

As infrastructures scale and workloads diversify, blind spots emerge. Observability addresses these challenges by providing end-to-end visibility into the state and behavior of your systems. This means you can detect issues earlier, understand their root causes, and fix them before they impact users.

Observability vs. Traditional Monitoring

Traditional monitoring answers the “what”, for example, CPU utilisation or error counts. Observability goes deeper and answers the “why” behind performance issues. It focuses on three core pillars:

Metrics – Quantifiable measurements (e.g., latency, throughput)
Logs – Detailed event records for context
Traces – Understanding requests as they travel across distributed systems

At Whileone Techsoft, we layer these pillars with analytics, telemetry, and profiling to deliver actionable insights.

Performance Benchmarking: The Foundation

Our performance benchmarking services form the cornerstone of observability. This data-driven approach uncovers bottlenecks early, before they become costly in production.

System Analytics for Deeper Understanding

Benchmarking generates performance data, but analytics transforms that data into insights. System analytics helps teams understand:

How workloads utilize CPU, memory, I/O, and network resources
The correlation between resource consumption and performance outcomes
Trends and anomalies in system behavior over time

Our analytics frameworks leverage advanced models to identify optimization opportunities, ensuring your workloads perform consistently and reliably.

Telemetry for Real-Time Visibility

Telemetry extends observability by collecting live data from hardware, firmware, middleware, and applications. It:

Captures fine-grained performance metrics continuously
Enables proactive alerts for deviations from benchmarks
Allows visualization of live system health through unified dashboards

Whileone's use of open standards like OpenTelemetry makes this telemetry layer scalable and interoperable with your existing tools.

Profiling for Root Cause Analysis

Even the best benchmarking and telemetry setups cannot replace profiling when you need detailed root cause analysis.

System-level profiling: Identifies hotspots in the kernel, drivers, or hardware interfaces
Code-level profiling: Finds inefficient functions, loops, or algorithms in the application stack

By correlating profiling data with benchmark and telemetry insights, we help engineering teams quickly diagnose and resolve performance regressions.

An Integrated Observability Framework

At Whileone Techsoft, we integrate benchmarking, analytics, telemetry, and profiling into a single observability framework:

Unified dashboards to correlate data across layers
Automated workflows for continuous testing and monitoring
Cross-silo visibility that spans hardware, system software, and applications

This holistic approach ensures reliable, high-performance outcomes for pre-silicon validation, cloud workload optimization, and edge deployments.
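As a sketch of the correlation analysis mentioned under system analytics, a Pearson coefficient between a telemetry metric (say, memory bandwidth) and a benchmark score indicates whether the two move together. The data points are illustrative:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative samples: memory bandwidth (GB/s) vs. benchmark score
bandwidth = [45, 52, 60, 71, 80]
score     = [310, 360, 400, 480, 540]

r = pearson(bandwidth, score)
print(f"correlation: {r:.3f}")
```

A coefficient near 1 suggests the workload is bandwidth-sensitive, which is exactly the kind of hypothesis profiling can then confirm or refute.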
Benefits of Observability-Driven Benchmarking

Real-World Example

One of our semiconductor customers was struggling with inconsistent performance in their post-silicon validation phase. By deploying Whileone's observability-driven benchmarking framework, they were able to:

Pinpoint compiler-level inefficiencies using code profiling: https://www.whileone.in/post/tuning-compiler-flags-for-custom-hardware
Correlate memory bandwidth metrics from telemetry data with workload performance: https://www.whileone.in/post/investigating-performance-discrepancy-in-hpl-test-on-arm64-machines

Best Practices for Building Observability

Observability is no longer a “nice-to-have”; it's essential for ensuring reliable, high-performance systems. Whileone Techsoft Pvt. Ltd. brings together performance benchmarking, system analytics, telemetry, and profiling to build observability frameworks tailored for semiconductor companies, cloud providers, and software enterprises. Ready to take your performance engineering efforts to the next level? Reach out to us to learn how our observability-driven services can help you reduce costs, accelerate time-to-market, and achieve industry-leading performance.

  • Understanding SPEC HPC Benchmarks: A Comprehensive Guide for Beginners

1. Introduction

High-Performance Computing (HPC) is at the core of solving complex computational problems in scientific research, engineering, and large-scale data analysis. Benchmarking plays a critical role in evaluating and optimizing HPC system performance. The Standard Performance Evaluation Corporation (SPEC) provides widely recognized benchmarking suites tailored for different computing environments, helping researchers, businesses, and hardware vendors assess system capabilities.

2. What is SPEC HPC Benchmarking?

SPEC HPC benchmarks are designed to measure the performance of high-performance computing systems under real-world workloads. Unlike general performance testing, SPEC HPC benchmarks focus on evaluating scalability, efficiency, and computational power across various hardware and software configurations. Key metrics include execution time, scalability efficiency, and energy consumption.

3. Why SPEC HPC Benchmarks Matter

Evaluating Scalability & Efficiency: SPEC benchmarks measure how well HPC systems scale with increasing workloads.
Benchmarking Real-World Applications: Unlike synthetic benchmarks, SPEC HPC benchmarks reflect real-world HPC workloads used in scientific and industrial applications.
Standardization & Comparability: They enable fair performance comparisons between different architectures, compilers, and system configurations.

4. Key SPEC HPC Benchmark Suites

SPEC MPI: Measures parallel computing performance using MPI-based workloads.
SPEC OMP: Evaluates OpenMP-based applications for multi-threaded workloads.
SPEC ACCEL: Assesses performance on GPUs and other accelerators.
SPEC CPU: Focuses on single-thread and multi-thread performance in computational workloads.

5. How SPEC HPC Benchmarks Work

Benchmark execution process: Benchmarks are executed under controlled conditions to ensure reproducibility.
Setting up the testing environment: Includes configuring system parameters, compilers, and libraries.
Running SPEC benchmarks on various HPC hardware: Executing the benchmark suite on HPC hardware to collect performance data.
Collecting and analyzing results.

Factors that impact benchmarking results:
CPU, GPU, and memory performance
Compiler optimizations and software configurations
Networking and storage bottlenecks

6. Understanding Benchmark Results

Interpreting SPEC Scores: Higher scores indicate better performance.
Comparing Results: Performance ratios help compare different architectures and software configurations.
Case Studies: SPEC benchmarks are widely used in industries like climate modeling, genomics, and engineering simulations to evaluate and improve HPC systems.

7. Best Practices for Running SPEC HPC Benchmarks

Preparing an Optimized Benchmarking Environment: Ensure system settings and compiler options align with best practices.
Choosing the Right SPEC Benchmark: Select the benchmark that aligns with the intended workload.
Avoiding Common Mistakes: Properly setting up software and avoiding misinterpretation of results ensures accurate assessments.

8. Future Trends in SPEC HPC Benchmarking

AI, ML, and Cloud Computing: Emerging workloads in artificial intelligence and machine learning are shaping future benchmarks.
Heterogeneous Computing: SPEC is evolving to benchmark performance across GPUs, FPGAs, and new architectures like RISC-V.
Upcoming Developments: Continuous updates in benchmarking methodologies are expected to keep pace with next-generation HPC innovations.

9. Conclusion

SPEC HPC benchmarks provide a standardized way to evaluate and compare HPC system performance. Businesses, researchers, and hardware vendors can leverage these benchmarks to optimize their computing infrastructure. For further exploration, SPEC's official website and research publications offer in-depth insights into benchmarking methodologies.
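To make the score interpretation above concrete: a SPEC score is built from per-benchmark ratios of a reference machine's time to the measured time, combined via a geometric mean. A sketch with illustrative runtimes:

```python
import math

def spec_ratio(reference_time, measured_time):
    """Per-benchmark performance ratio: reference runtime over measured runtime."""
    return reference_time / measured_time

def spec_score(ratios):
    """Overall score: geometric mean of the per-benchmark ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Illustrative (reference s, measured s) pairs for three benchmarks:
runs = [(1000, 250), (1600, 320), (700, 350)]
ratios = [spec_ratio(ref, t) for ref, t in runs]
print(f"ratios: {[round(r, 2) for r in ratios]}, score: {spec_score(ratios):.2f}")
```

The geometric mean is used so that no single benchmark dominates the suite: doubling performance on one benchmark contributes the same factor to the score regardless of its absolute runtime.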

  • YOLOX on RISC-V QEMU

Goal of this project: to determine RISC-V's readiness for running YOLOX for the latest edge requirements.

Target Application: Running YOLOX on RISC-V QEMU involves setting up a RISC-V virtual machine and then configuring the necessary environment to compile and run YOLOX. Please note that this is a complex process, and it's essential to have prior experience with virtualization and RISC-V development. The RISC-V website hosts a blog post ( https://riscv.org/blog/2023/07/yolox-for-object-detection/ ) which describes the steps to build and run YOLOX for a development board. Those steps did not work as-is when running on QEMU. This blog assumes its readers are comfortable with a Linux-based host system (this guide is based on Ubuntu 22.04).

Step 1: Install QEMU and Set Up a RISC-V Virtual Machine

First, you need to install QEMU and the RISC-V toolchain. You can do this by running:

sudo apt-get install qemu-system-riscv

In this step, you'll create a RISC-V virtual machine using QEMU. You'll need a RISC-V disk image for this. You can find pre-built RISC-V images for various Linux distributions online, or build your own if you prefer.

wget https://cdimage.ubuntu.com/releases/22.04/release/ubuntu-22.04.3-preinstalled-server-riscv64+unmatched.img.xz
xz -d ubuntu-22.04.3-preinstalled-server-riscv64+unmatched.img.xz
# Rename the QEMU image
mv ubuntu-22.04.3-preinstalled-server-riscv64+unmatched.img riscv-ubuntu2204.img
qemu-img resize riscv-ubuntu2204.img +16G

Launch the QEMU VM as follows:

qemu-system-riscv64 -nographic -machine virt -m 16G -append "root=/dev/vda rw" -drive file=riscv-ubuntu2204.img,if=none,format=raw,id=hd0 -device virtio-blk-device,drive=hd0 -device virtio-net-device,netdev=net0 -netdev user,id=net0

This boots the RISC-V VM with 16 GB of RAM.

Step 2: Configure the Python Environment

Once the VM is up and running, log in and set up your RISC-V development environment. You may need to install additional dependencies, which vary depending on the distribution and version. Most of the packages that Python software depends on can be installed with pip. Run the following command to install pip:

apt install python3-pip

Before installing other Python packages, install the venv package, which is used to create a Python virtual environment:

apt install python3.11-venv

Create a Python virtual environment and activate it:

cd /root
python3 -m venv yolox
source /root/yolox/bin/activate

Step 3: Install the Necessary whl Packages

The Python ecosystem for the RISC-V architecture is still maturing. We have created prebuilt wheel (whl) packages that can be installed directly on Python 3.11.

Step 4: Build and Run YOLOX

Next, clone the YOLOX repository inside your RISC-V QEMU VM:

git clone https://github.com/Megvii-BaseDetection/YOLOX

Navigate to the YOLOX directory and build the YOLOX code. This step may involve installing additional dependencies and configuring the build for the RISC-V architecture:

cd YOLOX
make

With YOLOX successfully built, you can now run it on your RISC-V system. You'll need to adapt the YOLOX commands to your specific use case and input data. Standard models: https://github.com/Megvii-BaseDetection/YOLOX#standard-models

In this example, yolox_s is downloaded:

wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth -P /home/ubuntu/
python3 tools/demo.py image -n yolox-s -c /home/ubuntu/yolox_s.pth --path assets/demo.png --conf 0.25 --nms 0.45 --tsize 640 --save_result --device cpu

# Output logs
2023-09-15 17:05:49.803 | INFO | __main__:main:269 - Model Summary: Params: 8.97M, Gflops: 26.93
2023-09-15 17:05:49.860 | INFO | __main__:main:282 - loading checkpoint
2023-09-15 17:05:53.884 | INFO | __main__:main:286 - loaded checkpoint done.
2023-09-15 17:06:24.598 | INFO | __main__:inference:165 - Infer time: 30.0775s
2023-09-15 17:06:24.708 | INFO | __main__:image_demo:202 - Saving detection result in ./YOLOX_outputs/yolox_s/vis_res/2023_09_15_17_05_53/demo.png

We would like to hear from you if this blog was useful. Please contact us at info@whileone.in. We would be happy to discuss your requirements and showcase our expertise in a variety of cloud and edge technologies.

  • Bring up Yocto for RISC-V deployment

We at Whileone Techsoft Pvt. Ltd. understood the requirements of our customer, who wanted a basic Yocto-based RISC-V deployment for their custom SoC chip. The customer intended to share this basic deployment with their clients, who wished to make use of the customer's SoC in their products. Our customer was unfamiliar with Yocto and what was needed to ensure a successful deployment. They had their own custom-patched Linux kernel, root file system, toolchain, and custom bootloader, as well as a custom simulator to boot the final image. Their client insisted on Yocto instead of their default Buildroot deployment. As Yocto has its own tools, compiler, and dependencies, the challenge was to ensure the final image generated by Yocto was compatible enough to be run by their custom simulator.

Introduction to Yocto:

With the open-source Yocto Project, we can create custom Linux-based systems for embedded products. It is quite possible to tailor the Linux images as per requirements with a set of flexible tools and friendly, customizable scripts. Yocto provides a reference embedded distribution called ‘Poky’, which was used for this project.

The customer's custom-patched Linux kernel was much older than the current version available on kernel.org. So, when we initially went with the latest Yocto version (Mickledore, v4.2), which featured GCC 12.x, we got errors during the kernel build. The errors pointed to some unknown assembly instructions; the reason was that our custom kernel version was old and had not been updated. As the customer was already using GCC 11.x in their build infrastructure, we searched for the Yocto version that provided the nearest GCC 11.x, which turned out to be Yocto Honister (v3.4). An initial test build was successful with Honister, so we finalized this version before moving ahead.

Yocto uses Bitbake as its build tool. Whenever we plan to create recipes in Yocto, we should create a separate folder inside poky that starts with “meta-”, as per the Yocto manual. Referring to similar meta folders like meta, meta-yocto-bsp, and meta-poky, we came up with our own “meta-riscv-custom”. To add a new meta layer, use bitbake commands such as the one below:

$> bitbake-layers add-layer meta-riscv-custom

Yocto Configuration options:

As we were using the sample Poky distribution of Yocto, to let Poky know that we intend to use our custom “meta-riscv-custom” folder in the build process, we have to update the file “bblayers.conf” in the build/conf directory. This build/conf directory is generated after we initialize the environment by executing “source oe-init-build-env” in the Poky root folder. We also have to modify the variable “MACHINE”, among others, in “build/conf/local.conf” to “qemuriscv64” and comment out the default value. There are other options in “local.conf” that we can modify to get image output in a desired format; the variable IMAGE_FSTYPES = “tar cpio” will generate the image in both tar and cpio formats. This is especially useful when we want to generate a root file system in these formats.

Creating recipes:

Recipes are script-like files created under the meta- folders, such as “riscv-linux.bb”, a recipe for building the Linux kernel, and “riscv-boot.bb” for building the bootloader.

Custom changes:

The customer was also interested in knowing how one could add a custom directory and files, and make custom changes to existing files in the file system. Yocto has its own package group recipe file, “packagegroup-core-boot.bb”, that can be modified. For example:

1. We can disable UDEV by commenting it out
2. Similarly, we can comment out HWCLOCK in the same file

To create a custom folder “custom-riscv” inside root (“/”) and a file named “custom.conf” with some configuration options and comments, we had to modify the recipe file “base-files_3.0.14.bb”.

Build:

To build, we make use of the following commands:

$> bitbake -c clean riscv-linux
$> bitbake riscv-linux

Note that the recipe name is given without its “.bb” extension. Also, if we had not added the folder path of “meta-riscv-custom” to bblayers.conf, running the above command would produce an error.

Build artifacts:

The artifacts are generated in the work directory under the path:
Poky/build/tmp/work/riscv64-poky-linux/riscv-linux/1.0-r0/custom-linux/*
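To illustrate what such a recipe can look like, here is a hypothetical minimal sketch of a kernel recipe for the custom layer; the source URI, branch, revision, and checksum are placeholders, not the customer's actual values:

```
# meta-riscv-custom/recipes-kernel/riscv-linux/riscv-linux.bb
# Hypothetical sketch only -- URI, branch, revision, and checksum are placeholders.
SUMMARY = "Customer-patched Linux kernel for the custom RISC-V SoC"
LICENSE = "GPLv2"
LIC_FILES_CHKSUM = "file://COPYING;md5=<md5-of-COPYING>"

# The customer's patched kernel tree (placeholder location)
SRC_URI = "git://example.com/riscv-linux.git;protocol=https;branch=custom"
SRCREV = "<pinned-commit-hash>"
S = "${WORKDIR}/git"

# Inherit the kernel class so do_compile/do_install follow the kernel build flow.
# A real recipe also needs a defconfig and machine compatibility settings.
inherit kernel
COMPATIBLE_MACHINE = "qemuriscv64"
```

A real recipe for this project would additionally carry the customer's patches via SRC_URI file:// entries and pin the kernel configuration, but the skeleton above is the shape Bitbake expects.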

  • GCP Cloud Performance: Time-Based Score Variations

In May 2022, one of our customers asked us to tune Elasticsearch with Esrally across cloud providers. We started by trying multiple combinations of manual runs on all cloud providers, collecting scaling runs with 2/4/8/16 cores. In this data collection, the scores did not scale proportionately. Hence, we decided to experiment with running the Elasticsearch Esrally benchmark throughout the day. As Esrally doesn't run for a fixed duration, we carried out the runs 50 times so that they would span a whole day. And here is what we saw!

The configurations used were:

Altra - t2a-standard-16
Intel Icelake - n2d-standard-16
Milan - c2d-standard-16
Elasticsearch 8.4.1
Esrally 2.6.0

Server: Altra | Intel Icelake | Milan
Client: Altra | Altra | Altra

Variation is observed according to the time of day. AMD is the best, with the lowest standard deviation, but Intel and Altra show large standard deviations.

The NGINX-wrk benchmark also shows such behaviour on GCP. NGINX-wrk runs were carried out 1440 times, keeping each run's duration at 60 seconds. Variation in p95 latency is observed through the time of day: both Intel and Altra show a 10% standard deviation in p95 latency numbers.

Do consider time-based score variations before running network applications:

Time of day does affect latency, since neighboring VMs might be busy or idle depending on the time of day.
Run-to-run variation is a function of the time of day.

Eventually, we were able to help the customer figure out where the performance difference was coming from. To ensure consistent output throughout the day, scaling the VMs was suggested.
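A quick way to quantify the time-of-day effect described above is the coefficient of variation (standard deviation over mean) across the repeated runs. A sketch with illustrative scores, not our measured data:

```python
import statistics

def coefficient_of_variation(scores):
    """Sample standard deviation as a fraction of the mean -- higher means noisier runs."""
    return statistics.stdev(scores) / statistics.mean(scores)

# Illustrative per-run benchmark scores collected across a day:
scores = [100, 104, 97, 110, 92, 105, 99, 108]
cv = coefficient_of_variation(scores)
print(f"run-to-run variation: {cv:.1%}")
```

Computing this per hour-of-day bucket makes the busy/idle-neighbor pattern visible and tells you how many repetitions a fair comparison actually needs.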

  • Network compute agnostic Performance Analysis for Cloud workloads

At Whileone we take pride in our customers' success. We help customers achieve their goals and execute the out-of-the-box ideas that are necessary for success. One such project was to get IPC (Instructions Per Cycle) numbers for cloud applications on different architectures, completely omitting the network stack. This would give the RISC-V chip-designing customer a good picture of whether their architecture's IPC is in line with competition like Intel or ARM.

To achieve this, we modified cloud applications to profile and benchmark their performance with no network or socket calls. The idea was to compare the performance of different architectures with vanilla versions and lite (modified) versions. This would let the customer run these applications on their simulator, get the IPC number of their architecture for each application, and compare it with the competition.

To give an example, one of the applications we picked was Redis, a cache server application. Redis takes SET/GET requests from clients and processes them internally to keep a cached copy for quick response. To do away with the network part, we simulated the client side so that, to Redis, it appears that N SET/GET requests have arrived and been processed. The performance numbers we then obtain are solely for the application's processing on that architecture. This eliminated network noise and gave a good picture of the IPC for core application processing.

The table below shows the IPC for Redis vs. Redis-Lite. The drop in instruction count per packet can be attributed to the networking sockets being removed.

SET             | Redis (Graviton2) | Redis (Intel 8275 Cascade Lake) | Redis-Lite (Graviton2)
IPC             | 0.94              | 0.69                            | 1.76
Icount / packet | ~39000            | ~30000                          | ~20200

In doing so, we made sure that we did not modify the program logic or core behavior of the application in any way. We could see a similar call stack for Redis and Redis-Lite. Below are the snapshots.

REDIS Flamegraph
REDIS-LITE Flamegraph

As evident from the flame graphs, the call stack of the core application is not altered. In the Redis-Lite flamegraph, the network component is absent.

Redis is a single-threaded application. We also helped the customer port various multi-threaded / multi-process applications. The customer was able to cross-compile these applications and run them on their RISC-V simulator. This was an interesting experiment from the performance-numbers point of view and useful for the customer in the early phase of chip development, helping them understand where they are placed with respect to the competition.
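IPC itself is just retired instructions divided by elapsed core cycles, and combined with an instruction count per request (like the Icount/packet row above) it yields cycles spent per request. A sketch with illustrative counter values:

```python
def ipc(instructions, cycles):
    """Instructions per cycle from two hardware counter readings."""
    return instructions / cycles

def cycles_per_request(icount_per_request, measured_ipc):
    """Core cycles spent per request, given instructions/request and IPC."""
    return icount_per_request / measured_ipc

# Illustrative counter values in the spirit of the table above:
print(f"IPC: {ipc(1_760_000, 1_000_000):.2f}")
print(f"cycles/request: {cycles_per_request(20_200, 1.76):,.0f}")
```

Cycles per request is the architecture-comparison metric that matters here: it folds both the instruction-count reduction and the IPC change into one number.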

  • Oracle Optimized BLIS Libraries for Ampere Altra Family

Basic Linear Algebra Subprograms (BLAS) and the BLAS-like Library Instantiation Software (BLIS) are libraries that accelerate mathematical operations on current CPU microarchitectures. As part of the FLAME project, BLIS was introduced to handle the dense linear algebra software stack. The framework was designed to isolate essential kernels of computation that, when optimized, immediately enable optimized implementations of most of its commonly used and computationally intensive operations. BLIS offers enhanced performance for matrix multiplications where the operands are small, and supports both single- and multi-threaded modes of operation. Oracle has optimized the BLIS libraries for exceptional performance on the Ampere Altra family of processors. Let us look at how we can leverage this to our benefit and what sort of performance boost can be expected.

Step 1: Getting BLIS Sources

git clone https://github.com/flame/blis.git
cd blis
git checkout ampere

Step 2: Building BLIS

./QuickStart.sh altramax
# Change ./QuickStart.sh altramax to ./QuickStart.sh altra if building for Ampere Altra processors
source ./blis_build_altramax.sh
source blis_setenv.sh
export LD_LIBRARY_PATH=/lib/altramax

Note: BLIS can be built for OpenMP (default) or pthreads. Details can be found in documentation/tutorial.

Step 3: Performance Experiments

For our test, we will be using HPL 2.3, the High Performance Linpack benchmark that is commonly used to test systems. We will be comparing the performance of the Oracle BLIS library with OpenBLAS and Arm-PL.

System Config:
OS: Ubuntu 22.04
Kernel: 5.19.0-46-generic
Toolchain: gcc (GCC) 12.3.0
Memory: 16x32GB

Results: For HPL, the Oracle-optimized BLIS libraries provide a 1.2x boost in performance.
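The 1.2x figure is a simple ratio of HPL throughput, and HPL efficiency against theoretical peak is computed the same way. A sketch with illustrative GFLOPS numbers, not our measured results:

```python
def speedup(gflops_new, gflops_baseline):
    """Throughput ratio between two HPL runs."""
    return gflops_new / gflops_baseline

def hpl_efficiency(rmax_gflops, rpeak_gflops):
    """Measured HPL Rmax as a fraction of theoretical peak Rpeak."""
    return rmax_gflops / rpeak_gflops

# Illustrative GFLOPS numbers:
print(f"speedup: {speedup(1440.0, 1200.0):.2f}x")
print(f"efficiency: {hpl_efficiency(1440.0, 1600.0):.0%}")
```

Tracking efficiency alongside speedup is useful because it shows how much of the gain comes from the BLAS kernels versus how much theoretical headroom remains.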

  • Using Google Charts in a Dynamic Way - How Using Google Charts Allowed Flexibility in a Short Dev Time Window

    We had a requirement to build a charting facility that could provide several charts. The requirement implied that we needed to support several different kinds of charts, but those charts weren't fully defined yet, so flexibility was required. Our setup was a headless CMS (Strapi.io) and NextJS for server-side-rendered and statically generated pages. We found react-google-charts to be an interesting library: for any chart, it requires the chart type, data, width, and height as inputs. Our workaround to deliver this overnight was to add a JSON field in the CMS for accepting these parameters and, on the frontend, pass them to react-google-charts. Implementing it this way meant we could support everything react-google-charts supports. Given that most content-managing stakeholders weren't from a web development background, we documented the feature and trained them on how to use it. This was made much simpler by the react-google-charts live examples: starting from any example that suited their requirements, the content managers could view the chart live, open the sandbox, tweak the data as required, and then create a JSON object from the data and chart type. The initial few charts did take development time to figure out details like tweaking the colors, having two y-axes with values in different units, and modifying the content of the hover bubble. After the initial 3-4 charts, however, such tweaks became repetitive, and the content managers could easily figure things out on their own by referring to those earlier charts and the sandboxes that react-google-charts provides.
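    The CMS-to-chart flow described above can be sketched as follows. This is a minimal illustration, not the production code: the JSON field shape and the `parseChartConfig` helper are assumptions, modelled on the props (chartType, data, width, height) that the react-google-charts Chart component accepts.

```typescript
// Shape assumed for the JSON field stored in the CMS (illustrative).
interface ChartConfig {
  chartType: string;                // e.g. "PieChart", "LineChart"
  data: (string | number)[][];      // header row followed by data rows
  width?: string;
  height?: string;
}

// Parse the raw JSON from the CMS field, validate the required fields,
// and apply defaults, so the frontend can pass the result straight to
// react-google-charts, e.g. <Chart {...cfg} />.
function parseChartConfig(raw: string): ChartConfig {
  const cfg = JSON.parse(raw) as Partial<ChartConfig>;
  if (!cfg.chartType || !Array.isArray(cfg.data)) {
    throw new Error("chart config must include chartType and data");
  }
  return {
    chartType: cfg.chartType,
    data: cfg.data,
    width: cfg.width ?? "100%",
    height: cfg.height ?? "400px",
  };
}
```

    Validating in one place like this keeps the frontend simple: any chart the content managers can express in the sandbox becomes a JSON blob the page can render without a code change.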

  • UI/UX and Graphic Design: Creating Seamless Digital Solutions

    Effective UI/UX and graphic design are crucial for building user-friendly applications in today's digital world. At Whileone Techsoft Pvt. Ltd., we blend creativity with technical expertise to deliver visually appealing, high-performance products that drive business success. Why us? We leverage a diverse set of design and development tools to create high-quality visual and interactive content. Our expertise spans the whole creative process, ensuring that each project is enhanced with tailored, impactful visuals and elements that align with the client's goals. We deliver UI/UX and development together: as UI/UX and development are interconnected, our designers create wireframes and prototypes that serve as blueprints for developers. Our expertise in tools like React and Vue.js facilitates this transition, ensuring a smooth development process, and collaboration between the design and development teams results in visually consistent, high-performing applications. Our experience in the semiconductor industry: our UI/UX designs help simplify complex systems. Our services, where development and design go hand in hand, helped customers deliver data with precision; the intuitive interfaces we created made visualizing and interacting with intricate technical data easier, reducing user error and enhancing productivity. At Whileone Techsoft Pvt. Ltd., we specialize in user-centric UI/UX and graphic design that aligns with business goals and technical needs. By working closely with development teams, we deliver cohesive digital products that enhance user engagement, improve workflows, and boost performance. Whatever your technical field, our design solutions simplify complex ideas with precision and clarity.
