CPU-Centric HPC Benchmarking with miniFE and GROMACS
Benchmarks are vital for evaluating High-Performance Computing (HPC) system performance, guiding hardware choices, and optimizing software. This whitepaper focuses on understanding and overcoming bottlenecks in HPC benchmarks for CPU environments, specifically considering ARM/AARCH64 architectures, using miniFE and GROMACS as examples. 1. Introduction to miniFE and GROMACS Benchmarks 1.1. miniFE: A Finite Element Mini-Application miniFE, part of the Mantevo suite, simulates i

Rahul Bapat
Jun 16, 2025 · 5 min read


To get maximum tokens generated for target CPU
LLMs are Getting Better and Smaller Let’s look at Llama as an example. The rapid evolution of these models highlights a key trend in AI: prioritizing efficiency and performance. When Llama 2 70B launched in August 2023, it was considered a top-tier foundational model. However, its massive size demanded powerful hardware like the NVIDIA H100 accelerator. Less than nine months later, Meta introduced Llama 3 8B, shrinking the model by almost 9x. This enabled it to run on smaller

Archana Barve
Jun 9, 2025 · 1 min read


Why Every Company Needs Robust Demos And How WhileOne Can Help
Building a great product is only half the battle. Demonstrating its capabilities convincingly, whether in front of customers, at an exhibition, or during a PoC, is often what seals the deal. Yet, for many companies, setting up demos ends up as a side project that falls through the cracks. At WhileOne, we understand this challenge. That’s why helping companies build reliable, repeatable demos has been part of our mission since day one. The Problem with Ad-Hoc Demos Most com

Rajeev Gadgil
Jun 2, 2025 · 2 min read


Benchmarking Meta Llama 4 Scout on CPU-Only Systems: Performance, Quantization, and Architecture Tuning
Meta’s Llama 4 Scout, released in April 2025, is a 17-billion parameter general-purpose language model that brings powerful reasoning to a broader range of applications—including those running without GPUs. This blog focuses on benchmarking Llama 4 Scout on CPU-only systems, covering: Tokens per second Latency per token Prompt handling efficiency Quantization techniques Architecture-specific optimization for x86, ARM, and RISC-V (RV64) Converting to GGUF format for efficient

Rajeev Gadgil
May 26, 2025 · 3 min read


Migrating JetStream 2.2 to Node.js: Challenges, Design, and What We Learned
JetStream is a JavaScript benchmark suite that evaluates web application performance by measuring the execution latency and throughput of complex workloads. With the release of JetStream 2.2, we at WhileOne Techsoft undertook the task of migrating its harness to a modern Node.js-based setup while working with a customer who was looking to benchmark their CPU using some JS workloads. This post dives into why we did it, how we did it, and what you can expect from th

Rajeev Gadgil
May 19, 2025 · 3 min read


Cross-Compiling SPEC CPU2017 for RISC-V (RV64): A Practical Guide
SPEC CPU2017 is a well-known benchmark suite for evaluating CPU-intensive performance. Although it assumes native compilation and execution, there are cases—especially with RISC-V (RV64) platforms—where cross-compilation is the only feasible route. This guide walks through the steps to cross-compile SPEC CPU2017 for RISC-V, transfer the binaries to a target system, and optionally use the --fake option to simulate runs where execution isn't possible or needed during develop

Rajeev Gadgil
May 12, 2025 · 3 min read


Uncovering the Best: 5 Top Tools for Cutting-Edge Chip Benchmarking
In the fast-paced world of technology, chip benchmarking is vital. It helps engineers and developers measure the performance of semiconductor devices to keep up with advancements. This post dives into the top five tools for chip benchmarking, highlighting their features, benefits, and real-world applications. 1. Geekbench Geekbench stands out as a cross-platform benchmarking tool for assessing CPU and GPU performance. Its versatility allows it to work seamlessly across differ

Nandita Gadgil
Apr 21, 2025 · 3 min read


Performance Modelling: How to Predict and Optimize System Efficiency
1. Introduction In today’s fast-paced digital world, system performance is critical to the success of applications ranging from cloud computing platforms to high-performance computing (HPC) workloads. Performance modelling is a powerful technique used to predict, analyze, and optimize the efficiency of computing systems. By simulating and understanding system behavior, developers, engineers, and IT managers can make informed decisions about design, scaling, and optimization s

Nandita Gadgil
Apr 14, 2025 · 2 min read


Understanding SPEC HPC Benchmarks: A Comprehensive Guide for Beginners
1. Introduction High-Performance Computing (HPC) is at the core of solving complex computational problems in scientific research, engineering, and large-scale data analysis. Benchmarking plays a critical role in evaluating and optimizing HPC system performance. The Standard Performance Evaluation Corporation (SPEC) provides widely recognized benchmarking suites tailored for different computing environments, helping researchers, businesses, and hardware vendors assess system c

Nandita Gadgil
Apr 7, 2025 · 2 min read


Automating Web Application Deployment on AWS EC2 with GitHub Actions
Introduction Deploying web applications manually can be time-consuming and error-prone. Automating the deployment process ensures consistency, reduces downtime, and improves efficiency. In this blog, we will explore how to automate web application deployment on AWS EC2 using GitHub Actions. By the end of this guide, you will have a fully automated CI/CD pipeline that pushes code from a GitHub repository to an AWS EC2 instance, ensuring smooth and reliable deployments. Seamles

Sameer Natu
Mar 17, 2025 · 3 min read


How CloudNudge Can Help You Optimize and Manage Your Cloud Expenses
Introduction Managing cloud costs is a growing challenge for software and hardware engineers. As cloud services expand, expenses can quickly spiral out of control without proper oversight. Engineers need a cloud cost management tool like CloudNudge to monitor, optimize, and reduce cloud spending efficiently. In this blog, we will explore why cloud cost management is essential, how specialized tools can help, and best practices for using them effectively. The Importance of

Nandita Gadgil
Mar 10, 2025 · 2 min read


Mastering the 5 Essential Performance Engineering Skills for Software Engineers: A Professional Guide
Performance engineering is a vital area in software development that guarantees applications function efficiently and effectively. As modern software systems grow more complex, the need for skilled engineers who understand performance becomes increasingly important. This guide will cover five essential performance engineering skills every software engineer should develop to thrive in their careers. Grasping Performance Requirements To start, software engineers must excel at u

Archana Barve
Mar 3, 2025 · 3 min read


Which Cloud Provider is best for you? A pricing and performance Breakdown
Cloud computing has become the backbone of modern businesses, with AWS, Google Cloud (GCP), Microsoft Azure, and Oracle Cloud Infrastructure (OCI) leading the market. Choosing the right cloud provider depends on various factors like pricing, performance, scalability, security, and real-world use cases. In this blog, we’ll break down these aspects to help you make an informed decision, using detailed tables, graphs, and deep insights. 1. Cloud Provider: Pricing Comparison Pric

Vishvanath Metkari
Feb 24, 2025 · 3 min read


Performance Testing with NeoLoad: A Detailed Exploration of WebFocus Use Case
In today’s software-driven world, ensuring the seamless performance of applications under varying workloads is a necessity. For performance testing, tools like NeoLoad empower testers to simulate real-world conditions and optimize application behavior. In this blog, we’ll delve into the practical use of NeoLoad for WebFocus performance testing, focusing on scenarios like chart rendering, page loads, data uploads, and resource utilization. Understanding the Scope of Performa

Manasi Bansode
Jan 27, 2025 · 3 min read


Debugging Tool for workloads using Java
GCeasy is a debugging tool used when generating performance engineering reports.

Samruddhi Gole
Oct 28, 2024 · 2 min read


AWS Lambda to generate SSH Keys
For the past few months, my team and I at WhileOne Techsoft Pvt. Ltd. have been helping our customer set up a system in which users can be granted access to a remote cloud server for testing. One of our client’s requirements is to generate SSH keys from the JIRA board. In JIRA, a custom script generates the SSH keys, which helps our client with project automation. SSH key pairs are two cryptographically secure keys that can be used to authenticate a client to an S

Mrinal Kshirsagar
Nov 27, 2023 · 4 min read


Investigating Performance Discrepancy in HPL Test on ARM64 Machines
Introduction: High-Performance Linpack (HPL) is a widely used benchmark for testing the computational performance of computing systems. In this blog post, we explore an intriguing scenario where we conducted HPL tests on two ARM64 machines. Surprisingly, the Host-2 machine exhibited a 20% lower performance than the Host-1 machine in the HPL test. Intrigued by this result, we embarked on a journey to comprehensively diagnose the underlying cause of this performance discrepancy

Vishvanath Metkari
Aug 1, 2023 · 5 min read


Server Performance Prediction using ML Models - Part 2
In the first part of the blog, we described the problem that we intend to solve, the data gathering, post processing, and generating the final training data. In the 2nd part, we will take a look at the Machine Learning model we used for training and for inference with new data. Correlation between various counters We have captured various counters for various benchmarks. Here is a graph that shows the correlation between each counter with every other counter. K Neighbors Re

Rajeev Gadgil
Jul 26, 2023 · 2 min read


Server Performance Prediction using ML Models - Part 1
This blog is the first part of the series in "Server Performance Prediction using Machine Learning Models" OVERVIEW: In the semiconductor industry, a Silicon Cycle takes nearly a year or more from conception to the silicon being available. As soon as the concept comes in, there is immense pressure on the marketing and sales teams to come up with performance numbers for this new generation of the silicon i.e. processor. There is a need for a way to find out what the score coul

Rajeev Gadgil
Jul 20, 2023 · 7 min read


GCP Cloud Performance: Time-Based Score Variations
In May 2022, one of our customers asked us to tune Elasticsearch with Esrally for cloud providers. We started with trying multiple combinations of manual runs on all cloud providers. We were collecting scaling runs with 2/4/8/16 cores. In the above data collection, we could not see the proportionate scores. Hence, we decided to experiment with running the Elasticsearch ESRally benchmark throughout the day. As Esrally doesn’t run for a particular duration, we carried out the r

Archana Barve
Apr 3, 2023 · 1 min read