Tuesday, 23 September 2025

🔧 Bitwise Essentials for Firmware: Macros, Flags, and Bit Spreading

Clean bit-twiddling is a reliability feature. Small helpers + predictable patterns beat ad-hoc shifts every time. Here are three tight patterns you’ll reuse everywhere.

🧷 Maintainable Bit Macros + Ordered Ops (Set 2&7, Clear 3, Toggle 5)

Goal: modify an 8-bit register with readable, testable macros.


#include <stdio.h>
#include <stdint.h>

#define BIT(b)            (1u << (b))
#define SET_BIT8(r,b)     ((r) = (uint8_t)((r) |  (uint8_t)BIT(b)))
#define CLEAR_BIT8(r,b)   ((r) = (uint8_t)((r) & (uint8_t)~BIT(b)))
#define TOGGLE_BIT8(r,b)  ((r) = (uint8_t)((r) ^  (uint8_t)BIT(b)))

static inline uint8_t modify_register(uint8_t reg) {
    SET_BIT8(reg, 2);     // set 2
    SET_BIT8(reg, 7);     // set 7
    CLEAR_BIT8(reg, 3);   // clear 3
    TOGGLE_BIT8(reg, 5);  // toggle 5
    return reg;
}

int main(void) {
    uint8_t reg;
    if (scanf("%hhu", &reg) != 1) return 0;   // bail out on bad input
    printf("%u", (unsigned)modify_register(reg));
    return 0;
}

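Notes: BIT(b) is unsigned, and the casts truncate each result to 8 bits. Each macro evaluates r twice, so pass a plain variable rather than an expression with side effects. For dynamic positions, ensure b < 8.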

🏷️ Decode Status Register → Human-Readable Flags (LSB→MSB)

Goal: map bits to names and print only enabled ones.


#include <stdio.h>
#include <stdint.h>

static const char * const flag_names[8] = {
    "Power On","Error","Tx Ready","Rx Ready",
    "Overheat","Undervoltage","Timeout","Reserved"
};

static void decode_status(uint8_t status_reg) {
    for (int i = 0; i < 8; ++i)
        if ((status_reg >> i) & 1u)
            printf("%s\n", flag_names[i]);
}

int main(void) {
    uint8_t reg;
    if (scanf("%hhu", &reg) != 1) return 0;   // bail out on bad input
    decode_status(reg);
    return 0;
}

Why this style: a lookup table (LUT) keeps the bit meanings in one place, and walking LSB→MSB matches the bit numbering used in datasheets.

🧩 Bit Spreading (Interleave Zeros): 8→16 with Even-Bit Placement

Goal: place each input bit at even positions (0,2,4,…) with zeros in odd positions.

Fast “dilate bits” version (branchless)


#include <stdio.h>
#include <stdint.h>

static inline uint16_t spread_bits(uint8_t x) {
    uint16_t v = x;
    v = (v | (v << 4)) & 0x0F0F;
    v = (v | (v << 2)) & 0x3333;
    v = (v | (v << 1)) & 0x5555;
    return v;
}

int main(void) {
    uint8_t val;
    if (scanf("%hhu", &val) != 1) return 0;   // bail out on bad input
    printf("%u", (unsigned)spread_bits(val));
    return 0;
}

Why it’s useful: display pipelines, Morton/Z-order, IO packing, DSP simulators.
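As one illustration of the Morton/Z-order use case, here is a minimal sketch that interleaves two 8-bit coordinates into a 16-bit index. The helper name morton_encode_2d is an assumption made for this example; spread_bits repeats the dilation steps from the listing above so the block stands alone.

```c
#include <stdint.h>

/* Same dilation steps as the listing above: bit i of x moves to bit 2*i. */
static inline uint16_t spread_bits(uint8_t x) {
    uint16_t v = x;
    v = (uint16_t)((v | (v << 4)) & 0x0F0F);
    v = (uint16_t)((v | (v << 2)) & 0x3333);
    v = (uint16_t)((v | (v << 1)) & 0x5555);
    return v;
}

/* Hypothetical helper: x lands on even bits, y on odd bits. */
static inline uint16_t morton_encode_2d(uint8_t x, uint8_t y) {
    return (uint16_t)(spread_bits(x) | (uint16_t)(spread_bits(y) << 1));
}
```

For example, x = 3 and y = 5 give 0x27: x contributes bits 0 and 2, and y contributes bits 1 and 5.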

🧊 Myth vs Truth

Myth: “Bit hacks are unreadable.”
Truth: Small, named helpers + LUTs are clearer and safer than ad-hoc shifts.

🔌 Embedded Relevance

Stable macros reduce register bugs.
Flag decoding doubles as documentation that won’t rot.
Bit spreading appears in protocols, graphics, and indexing.

✅ Conclusion

Treat bit ops like APIs: clear names, tight scopes, and deterministic patterns. You’ll get safer register code, self-documenting status handling, and reusable transforms that scale.

Written By: Musaab Taha

This article was improved with the assistance of AI.

Monday, 22 September 2025

The Physics of Debug-Later vs TDD (and Why Embedded Needs It)

Traditional “debug-later programming” (DLP) lets bugs age. As Td (time-to-discover) grows, Tfind (time-to-locate) explodes; Tfix (time-to-fix) often rises too as wrong assumptions accrete.

TDD collapses Td → 0, keeping Tfind + Tfix minimal and predictable.

🧪 DLP vs TDD: Cause & Effect

DLP: Long Td ⇒ context fades, dependencies pile up ⇒ Tfind spikes; Tfix grows when bad code becomes a foundation.
TDD: Td ≈ 0 (immediate test failure) ⇒ revert last change or make a pinpoint fix; defects are prevented, not shipped.

🔁 The TDD Microcycle (Red–Green–Refactor)

Add a small test (one behavior).
Run all tests → red (fail/doesn’t compile).
Implement the minimum to pass.
Run all tests → green.
Refactor (remove duplication, clarify intent), with tests guarding behavior.

Each loop: seconds to minutes. Feedback stays fresh; progress is measurable; breakages are obvious.
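One pass through the microcycle might look like the sketch below. It uses plain assert from the C standard library instead of a test framework, and clamp_percent is a hypothetical function invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Step 3: the minimum production code that makes the test below pass. */
static uint8_t clamp_percent(int value) {
    if (value < 0)   return 0;
    if (value > 100) return 100;
    return (uint8_t)value;
}

/* Step 1: one small test per behavior, written before the code above existed. */
static void test_clamp_percent(void) {
    assert(clamp_percent(42)  == 42);    /* in-range values pass through */
    assert(clamp_percent(-5)  == 0);     /* below range clamps to 0 */
    assert(clamp_percent(250) == 100);   /* above range clamps to 100 */
}
```

Calling test_clamp_percent() from a host-side test runner gives the red/green signal: before clamp_percent exists the build fails (red), and once it is implemented the asserts pass silently (green).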

🧊 Myth vs Truth

Myth: “Write lots of tests, then lots of code.”
Truth: Tiny test → tiny code → refactor. Small steps keep Td near zero.

📈 TDD Benefits (Developer Reality)

Fewer bugs & regressions (fast feedback prevents drift).
Less debugging (defects die early).
Fewer side effects (tests codify assumptions/constraints).
Executable documentation (examples that don’t lie).
Design pressure (testable code ⇒ smaller, decoupled units).
Progress signal (green bar defines “done”).
It’s motivating (tight loops, visible wins).

🔌 Why Embedded Especially Benefits

Hardware independence early: verify production code before boards exist or when access is scarce.
Shorter target cycles: remove bugs on host; flash less, learn more.
Faster HW debug: isolate HW/SW boundaries with test doubles.
Better architecture: decoupling for testability reduces tight coupling to peripherals.
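To make the test-double idea concrete, here is a minimal sketch of a driver whose register address is injected rather than hard-coded. All names (led_driver_init, led_turn_on, led_turn_off) and the 16-LED layout are assumptions for illustration. On the target you would pass the real peripheral address; on the host, a test passes the address of an ordinary variable.

```c
#include <stdint.h>

/* Hypothetical driver: the register address is injected, not hard-coded. */
static volatile uint16_t *leds_reg;

static void led_driver_init(volatile uint16_t *address) {
    leds_reg = address;
    *leds_reg = 0;                        /* all LEDs off at start-up */
}

static void led_turn_on(unsigned led) {   /* LEDs are numbered 1..16 */
    if (led >= 1 && led <= 16)
        *leds_reg |= (uint16_t)(1u << (led - 1));
}

static void led_turn_off(unsigned led) {
    if (led >= 1 && led <= 16)
        *leds_reg &= (uint16_t)~(1u << (led - 1));
}
```

A host test can declare uint16_t virtual_leds, call led_driver_init(&virtual_leds), and then assert on virtual_leds after each call, with no board attached.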

✅ Conclusion

Treat defects like radioactive isotopes: shorten their half-life. By keeping Td near zero, TDD minimizes Tfind and Tfix, turns specs into executable tests, and makes embedded code reliable, predictable, and change-friendly. DLP institutionalizes waste; TDD institutionalizes feedback.

Written By: Musaab Taha

This article was improved with the assistance of AI.

References:

Test-Driven Development for Embedded C, James W. Grenning

FreeRTOS on STM32 - Manual Integration

“Build succeeded” isn’t the finish line. With RTOS work, the real traps are missing configs, handler clashes, and timer conflicts that pass compile but fail on hardware. Here’s a compact, reliable path.

😬 The Trap: “It Compiles!” but Doesn’t Run

Missing or incomplete FreeRTOSConfig.h → hard build errors or silently applied defaults
Duplicate exception handlers (SysTick/PendSV/SVC) → link conflicts
HAL vs FreeRTOS both using SysTick → time-base collision
Undefined hooks or a misconfigured heap → link errors or runtime weirdness

⚙️ Two Ways to Add FreeRTOS

Manual (portable skill): add kernel sources yourself; you control layout and learn what matters.
CubeIDE GUI (fast): Middleware → FreeRTOS (CMSIS-RTOS v1/v2) auto-adds kernel + CMSIS layer.

🛠 Manual, Minimal, Deterministic Setup

Create project in STM32CubeIDE (CubeMX-based).
Add sources under ThirdParty/FreeRTOS/:
- Copy: License, Source/ (incl. include/) and portable/
- Keep only portable/GCC/ARM_CM4F for F4 + FPU; delete other compilers/arches
Heap: keep heap_4.c; delete heap_1/2/3/5. Exclude sysmem.c (FreeRTOS provides its own heap mgmt).
Include paths (Project → Properties → C Compiler → Includes):
- .../FreeRTOS/Source/include
- .../FreeRTOS/Source/portable/GCC/ARM_CM4F
Add FreeRTOSConfig.h (start from an STM32F407 demo; then tailor). Ensure:
- extern uint32_t SystemCoreClock; enabled for your compiler (e.g., #if defined(__ICCARM__) || defined(__GNUC__) || defined(__CC_ARM)).
Resolve handler duplicates (in .ioc → System Core → NVIC → Code generation):
- Uncheck SVC, PendSV, and SysTick handlers (FreeRTOS provides them via macros).
- Regenerate.
Separate time bases:
- .ioc → SYS → HAL Time base source = TIM6 (reserve SysTick for FreeRTOS).
- Set NVIC priority grouping to 4 bits preemption, 0 bits subpriority.
Start lean: in FreeRTOSConfig.h set initially
configUSE_TICK_HOOK=0, configUSE_MALLOC_FAILED_HOOK=0, configCHECK_FOR_STACK_OVERFLOW=0 (enable later with proper handlers).
Build → now you’re integrated cleanly.
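The FreeRTOSConfig.h items from the steps above can be sketched as the fragment below. It is not a complete configuration: the hook values simply mirror the "start lean" step, and the handler-mapping macros are the ones commonly found in FreeRTOS Cortex-M demo configs. Check them against the demo file you actually start from.

```c
/* FreeRTOSConfig.h (fragment): illustrative only, not a complete config. */
#include <stdint.h>

#if defined(__ICCARM__) || defined(__GNUC__) || defined(__CC_ARM)
extern uint32_t SystemCoreClock;   /* defined by the CMSIS system file */
#endif

#define configCPU_CLOCK_HZ               (SystemCoreClock)

/* Start lean: keep hooks off until their handler functions exist. */
#define configUSE_TICK_HOOK              0
#define configUSE_MALLOC_FAILED_HOOK     0
#define configCHECK_FOR_STACK_OVERFLOW   0

/* Map the FreeRTOS port handlers onto the CMSIS exception names,
   which is why the generated SVC/PendSV/SysTick handlers must be unchecked. */
#define vPortSVCHandler      SVC_Handler
#define xPortPendSVHandler   PendSV_Handler
#define xPortSysTickHandler  SysTick_Handler
```

With these mappings in place, a second definition of SVC_Handler, PendSV_Handler, or SysTick_Handler in the generated interrupt file would produce exactly the duplicate-symbol link error described earlier.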

🖱 CubeIDE One-Click Path (CMSIS-RTOS v2)

.ioc → Middleware > FreeRTOS → choose CMSIS-RTOS v2; set heap size.
Honor the IDE warnings:
- Change HAL time base to TIM6 (not SysTick).
- Enable Newlib reentrancy (Advanced settings → USE_NEWLIB_REENTRANT) for thread-safe libc in multitasking.
Generate code. You can use CMSIS APIs (e.g., osThreadNew) or native FreeRTOS APIs.

🧊 Myth vs Truth

Myth: “If it links, RTOS will run.”
Truth: RTOS needs correct handlers, time base, and config. Compile-time success doesn’t prove scheduler health.

🔌 Embedded Relevance

Deterministic bring-up removes heisenbugs before they reach hardware.
Clear separation of SysTick (RTOS) and HAL time base stabilizes timing.
Manual path teaches portable skills you can reuse on any MCU/IDE.

✅ Conclusion

Treat RTOS bring-up as a reliability exercise: one owner for critical handlers, one timer per role, and a known-good FreeRTOSConfig.h. Do that, and your “Hello, World” task isn’t luck—it’s repeatable, predictable, and production-ready.

Written By: Musaab Taha

This article was improved with the assistance of AI.

Bitwise Micro-Patterns for Embedded C (Reliable, Register-Safe)

For register work, “just shift & pray” leads to flaky bugs: off-by-one positions, undefined shifts, and masks that bleed into neighboring fields. Below are three tiny, composable patterns—each safe, predictable, and easy to review.

🧷 Pattern 1 — Is the Bit Set?

Goal: return 1 if bit at pos is 1, else 0.
Why: clean flag checks that always yield exactly 0 or 1.


#include <stdint.h>

static inline uint8_t is_bit_set_u8(uint8_t reg, uint8_t pos) {
    if (pos >= 8) return 0;                 // defensive bound
    return (uint8_t)((reg >> pos) & 1u);
}

Notes: Bound-check avoids accidental UB from oversized shifts; expression stays constant-time and side-effect free.

🔧 Pattern 2 — Set a Run of Bits (pos..pos+len-1) in a 32-bit Register

Goal: set a contiguous field to 1s without touching other bits.
Why: safe field enabling and mode configuration.


#include <stdint.h>

static inline uint32_t set_bits_u32(uint32_t reg, uint8_t pos, uint8_t len) {
    if (len == 0 || pos >= 32) return reg;
    if (len > 32 - pos) len = (uint8_t)(32 - pos);         // clamp instead of UB

    // Build mask in 64-bit to avoid (1U << 32) UB, then cast down.
    uint32_t mask = (uint32_t)(((uint64_t)1 << len) - 1u) << pos;
    return reg | mask;
}

Notes:

Using 64-bit during mask build sidesteps undefined behavior for len == 32.
Clamping guarantees pos + len ≤ 32.
Pure bit-ops, no branches after checks.

🏁 Pattern 3 — Keep Only the Highest Set Bit (uint16_t)

Goal: clear all bits except the highest set bit of the input.
Why: priority encode, normalize amplitudes, fast binning.


#include <stdint.h>

static inline uint16_t keep_highest_bit_u16(uint16_t x) {
    if (!x) return 0;
    x |= (uint16_t)(x >> 1);
    x |= (uint16_t)(x >> 2);
    x |= (uint16_t)(x >> 4);
    x |= (uint16_t)(x >> 8);
    return (uint16_t)(x - (x >> 1));   // leaves only the topmost 1
}

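Notes: The smear-and-subtract core is branchless (the early return for zero is optional, since the same arithmetic already yields 0); the trick ports to other fixed widths (extend with more shifts for wider types).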
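As a sketch of the "more shifts for wider types" remark, a 32-bit variant needs one extra smear step. The name keep_highest_bit_u32 is an assumption for this example.

```c
#include <stdint.h>

static inline uint32_t keep_highest_bit_u32(uint32_t x) {
    x |= x >> 1;     /* smear the top set bit downward... */
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;    /* ...the extra step needed for 32 bits */
    return x - (x >> 1);   /* all-ones-below-top minus its half leaves only the top bit */
}
```

For instance, 0x12345678 reduces to 0x10000000, and an input of 0 comes back as 0 without any special case.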

🧊 Myth vs Truth

Myth: “Bit-twiddling is fast even if a little sloppy.”
Truth: One undefined shift or leaky mask can corrupt adjacent fields—precision is performance.

🔌 Embedded Relevance

Deterministic, reviewable code for registers and flags.
UB-free masks → fewer Heisenbugs during HW bring-up.
Easy to wrap into driver libraries and unit-test.

✅ Conclusion

Small, defensive bitwise patterns pay off: no undefined shifts, no accidental field clobbering, and behavior that stays predictable under pressure. Treat masks like APIs—validate inputs, avoid UB, and your register code becomes boringly reliable (the best kind).

Written By: Musaab Taha

This article was improved with the assistance of AI.

Sunday, 21 September 2025

Why Do We Need Test-Driven Development (TDD)?

For years, traditional software development followed the “build first, debug later” model. Code was written, compiled, and only then tested. Debugging consumed nearly half of the development cycle—an unpredictable, risky activity where fixing one bug often created new ones.

🐞 Debug-Later Programming (DLP)

Code → then test → then debug.
Late feedback meant bugs lingered until integration.
Regression test suites helped, but surprises still emerged.
Debugging was costly and schedule-killing.

A famous example: the Zune bug (2008). On December 31, 2008, the last day of a leap year, Zune 30 players froze at startup because of a subtle date-handling error. A single automated test for the leap-year edge case could have prevented it.
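The failure is widely described as a days-to-year conversion loop that never terminated on day 366 of a leap year. The sketch below is not the device's code; it is a hypothetical year_from_days helper written so that the edge case terminates, and it shows the kind of one-line check that pins the behavior down.

```c
#include <stdbool.h>

static bool is_leap_year(int year) {
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Hypothetical helper: days is 1-based, counted from 1 January of origin_year. */
static int year_from_days(int days, int origin_year) {
    int year = origin_year;
    for (;;) {
        int days_in_year = is_leap_year(year) ? 366 : 365;
        if (days <= days_in_year)
            break;               /* day 366 of a leap year exits here */
        days -= days_in_year;
        ++year;
    }
    return year;
}
```

A test such as checking that year_from_days(366, 2008) returns 2008 exercises exactly the last-day-of-a-leap-year case.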

⚡ Enter Test-Driven Development (TDD)

TDD flips the process:

Write a failing test describing the behavior.
Write the minimal code to make the test pass.
Refactor and repeat.

With TDD:

Tests are automated and run continuously.
Unintended consequences are caught immediately.
Feedback is fast, design decisions are clearer, and debugging overhead shrinks dramatically.
The growing test suite becomes as valuable as the production code.

🌍 Why Embedded Systems Need TDD

Jack Ganssle put it best:

“The only reasonable way to build an embedded system is to start integrating today. Test and integration are the very fabric of development.”

For embedded and real-time systems, unknowns are the biggest schedule killers. TDD weaves testing into development itself—like Kevlar for your code.

✅ Conclusion

TDD is not just about testing—it’s about designing with safety nets. It provides immediate feedback, prevents regressions, and makes software robust and predictable. In domains like embedded systems, where a missed deadline or hidden bug can be catastrophic, TDD transforms uncertainty into reliability.

Written By: Musaab Taha

This article was improved with the assistance of AI.

Understanding Real-Time Systems vs. General-Purpose OS

When we hear real-time computing, it’s often misunderstood as simply “fast.” The truth? Real-time is about predictability—guaranteeing deadlines, not just raw performance.

🔍 Real-Time Applications (RTA)

Myth: Real-time = faster execution.
Truth: Real-time = deterministic response within strict deadlines.
Types:
- Hard real-time: Missing a deadline = total failure (e.g., airbag deployment, ABS braking).
- Soft real-time: Occasional deadline misses tolerated (e.g., VoIP calls).

RTAs keep response times almost constant across iterations, unlike non-RTAs where response time varies with load.

⚙️ Real-Time Operating System (RTOS) vs. General-Purpose OS (GPOS)

RTOS: Deterministic, bounded interrupt & scheduling latency, priority-based preemptive scheduling.
GPOS: Focuses on throughput & fairness (Linux, Windows, macOS). Great for desktops, not for strict deadlines.

Key differences:

Scheduling:
- RTOS → always favors high-priority tasks.
- GPOS → fairness policy, throughput focus.
Latency:
- RTOS → bounded & predictable.
- GPOS → varies with system load.
Priority inversion:
- RTOS → mitigated with techniques like priority inheritance.
- GPOS → usually not addressed, since a delayed task is rarely critical.
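The "always favors high-priority tasks" rule can be sketched as a selection function that runs on a host. The task_t layout, the pick_next_task name, and the "larger number means higher priority" convention are all assumptions for illustration; a real RTOS kernel uses more elaborate ready lists.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint8_t priority;   /* assumption: larger number = higher priority */
    uint8_t ready;      /* 1 if the task can run now */
} task_t;

/* Return the index of the highest-priority ready task, or -1 if none is ready. */
static int pick_next_task(const task_t *tasks, size_t count) {
    int best = -1;
    for (size_t i = 0; i < count; ++i) {
        if (tasks[i].ready &&
            (best < 0 || tasks[i].priority > tasks[best].priority)) {
            best = (int)i;
        }
    }
    return best;
}
```

The point of the sketch is that the choice depends only on priority and readiness, never on how long a lower-priority task has been waiting, which is the opposite of a fairness policy.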

🔄 Multitasking

In embedded systems, multitasking often runs on a single core. The scheduler slices CPU time across tasks (e.g., sensor read, display update, button handling), giving the illusion of parallelism. On multi-core desktops, tasks can run truly in parallel, but embedded systems mostly rely on smart scheduling.

✅ Conclusion

Real-time systems are not about speed—they are about meeting deadlines predictably. RTOSs trade throughput for determinism, making them vital for safety-critical applications like automotive, aerospace, and medical systems.

Written By: Musaab Taha

This article was improved with the assistance of AI.

Bit Manipulation in C: Set/Clear, Toggle, and Test Bits (With Tiny, Fast Utilities)

Working close to hardware means getting comfy with bits. Below are three tiny, production-ready C helpers you can drop into any project to set/clear a bit, toggle a specific bit, and check whether a bit is set. Each runs in constant time and uses simple masks—perfect for firmware, drivers, and competitive programming.

1) Set or Clear a Specific Bit (8-bit Register)

Goal: Given an 8-bit register value, a bit position (0–7), and a mode (1=set, 0=clear), return the updated value.


#include <stdio.h>
#include <stdint.h>

static inline uint8_t modify_bit(uint8_t reg, int pos, int mode) {
    if (mode) {
        reg |= (uint8_t)(1u << pos);      // set bit
    } else {
        reg &= (uint8_t)~(1u << pos);     // clear bit
    }
    return reg;
}

int main(void) {
    uint8_t reg;
    int pos, mode;

    if (scanf("%hhu %d %d", &reg, &pos, &mode) != 3) return 0;

    // (Optional) basic guardrails — ignore invalid input
    if (pos < 0 || pos > 7 || (mode != 0 && mode != 1)) {
        return 0;
    }

    printf("%u", (unsigned)modify_bit(reg, pos, mode));
    return 0;
}

How it works:

1u << pos builds a mask for the target bit.
OR (|=) sets the bit; AND with NOT (&= ~) clears it.
Casting back to uint8_t keeps the result strictly 8-bit, even though C performs the arithmetic in int after integer promotion.

2) Toggle the 5th Bit (0-based)

Goal: Flip bit at position 5. XOR is your friend.


#include <stdio.h>

static inline int toggle_fifth_bit(int n) {
    return n ^ (1 << 5);   // XOR with 0b0010_0000 (32)
}

int main(void) {
    int n;
    if (scanf("%d", &n) != 1) return 0;
    printf("%d", toggle_fifth_bit(n));
    return 0;
}

Why XOR?

x ^ 1 flips a bit; x ^ 0 leaves it unchanged.
(1 << 5) targets only the 5th bit, so everything else stays intact.

Complexity: O(1) time, O(1) space.

3) Check if K-th Bit Is Set

Goal: Print 1 if the K-th bit of N is 1; else 0.


#include <stdio.h>

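static inline int is_kth_bit_set(int n, int k) {
    if (k < 0 || k >= (int)(sizeof(int) * 8)) return 0;  // reject shifts that would be UB
    return ((unsigned)n & (1u << k)) ? 1 : 0;            // unsigned mask: safe for the top bit too
}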

int main(void) {
    int n, k;
    if (scanf("%d %d", &n, &k) != 2) return 0;
    printf("%d", is_kth_bit_set(n, k));
    return 0;
}

Why AND?

n & (1 << k) isolates that single bit. Non-zero → set; zero → clear.

Quick Notes & Best Practices

Prefer fixed-width types (uint8_t, uint32_t) for register-like code.
Validate input ranges when reading from users or untrusted sources.
When targeting 8-bit registers, cast masks to uint8_t to avoid surprises during promotion.
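The promotion surprise mentioned in the last note can be shown with a small sketch. The two function names are assumptions for this example; the behavior follows from C's integer promotion rules, under which ~mask is computed as an int.

```c
#include <stdint.h>

/* Returns 1 if the comparison matches, 0 otherwise. */
static int compare_without_cast(uint8_t reg, uint8_t mask) {
    return reg == ~mask;            /* ~mask is an int: 0x0F becomes -16, never equal to a uint8_t */
}

static int compare_with_cast(uint8_t reg, uint8_t mask) {
    return reg == (uint8_t)~mask;   /* truncated back to 8 bits: 0x0F becomes 0xF0 */
}
```

With reg = 0xF0 and mask = 0x0F, the uncast comparison is false while the cast one is true. Some compilers warn about the uncast form, which is a useful hint that a cast is missing.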

Conclusion

Bit operations are the bread-and-butter of embedded and systems work. With three tiny helpers—modify, toggle, and test—you can safely manipulate registers and flags with clean, constant-time code. Keep these patterns handy; they scale from toy examples to real device drivers.

Written By: Musaab Taha

This article was improved with the assistance of AI.