9 Commits

5 changed files with 340 additions and 0 deletions


@@ -0,0 +1,5 @@
# Quality Description
Writing tests first forced every component to be injectable and independently exercisable before any integration happened. That constraint mattered more than expected when the race-condition tests were added at the end: because the producer, sysfs reader, and IPC bridge had already been broken into units with explicit interfaces (`std::function` callbacks, injected sleep, injected logger), the stress tests could be wired up without touching any production code. Nothing needed to be refactored to be testable; it already was. That is the practical benefit of TDD for concurrent embedded software: writing the test first makes shared mutable state and deep call chains painful to test, so they tend to get designed out, and cyclomatic complexity tends to drop as a side effect.
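As a rough illustration of what "injected sleep, injected logger" can look like, here is a minimal sketch. The names `PollLoop`, `SleepFn`, `LogFn`, `RunRecord`, and `run_with_fakes` are hypothetical and are not taken from this repository; the real `Producer` has a richer interface.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical aliases for the injected dependencies (illustration only).
using SleepFn = std::function<void(std::chrono::milliseconds)>;
using LogFn = std::function<void(const std::string&)>;

// Hypothetical unit: logs and sleeps once per iteration, using only the
// callbacks it was given. It never touches the real clock or a real logger.
class PollLoop {
public:
    PollLoop(SleepFn sleep, LogFn log)
        : sleep_(std::move(sleep)), log_(std::move(log)) {}

    void run(int iterations, std::chrono::milliseconds period) {
        for (int i = 0; i < iterations; ++i) {
            log_("tick " + std::to_string(i));
            sleep_(period);
        }
    }

private:
    SleepFn sleep_;
    LogFn log_;
};

// Test-side record of what the fakes observed.
struct RunRecord {
    std::vector<std::string> logs;
    std::chrono::milliseconds slept{0};
};

// Drives the loop with a recording sleep and logger, so no real time passes.
inline RunRecord run_with_fakes(int iterations,
                                std::chrono::milliseconds period) {
    RunRecord rec;
    PollLoop loop(
        [&rec](std::chrono::milliseconds d) { rec.slept += d; },
        [&rec](const std::string& msg) { rec.logs.push_back(msg); });
    loop.run(iterations, period);
    return rec;
}
```

Because the sleep is a callback, a test can assert on the total requested delay instead of waiting for it, which is what keeps stress tests fast and deterministic.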

docs/self-assessment.md Normal file

@@ -0,0 +1,25 @@
# Self-Assessment
## Two Real Difficulties
**1. Maintaining TDD discipline under time pressure**
Sticking to a strict test-first workflow throughout the session was genuinely hard. Between the deadline and the accumulated fatigue of a full day of work beforehand, there were moments where the temptation to just write the implementation and then fill in the tests afterwards was real. I did not always resist it. Some tests were written after the fact rather than before, which is something I am aware of and want to be honest about.
**2. Designing testable seams at the IPC and sysfs boundaries**
The components that most needed testing were also the ones most coupled to external resources: a live socket and a real sysfs path. The difficulty was finding the right abstraction level: too thin and the tests require actual kernel resources; too thick and you end up testing your mocks, not your logic. The solution was to inject the transport as a plain `std::function` callback into the producer, and to point the sysfs reader at a controlled fake file on disk. Both approaches keep the core logic testable with no sockets, no threads, and no Qt, but arriving at that boundary (deciding what to abstract and what to leave concrete) required more iteration than I anticipated.
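The two seams described above can be sketched roughly as follows. This is a simplified illustration, not the project's code: `read_enabled_flag`, `produce_once`, `write_fake_sysfs`, and the `SendFn` alias are hypothetical names, and treating any content other than `1` as "disabled" is an assumption.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sysfs reader: true when the file's first token is "1".
// It takes a path, so a test can point it at an ordinary file on disk.
inline bool read_enabled_flag(const std::string& path) {
    std::ifstream in(path);
    std::string token;
    return static_cast<bool>(in >> token) && token == "1";
}

// Hypothetical transport seam: the producer only sees a callback.
using SendFn = std::function<void(int)>;

// Hypothetical single producer step: send the value only when enabled.
// Returns true if the value was handed to the transport.
inline bool produce_once(const std::string& sysfs_path, int value,
                         const SendFn& send) {
    if (!read_enabled_flag(sysfs_path)) {
        return false;
    }
    send(value);
    return true;
}

// Test-side helper: (re)writes the fake sysfs file with the given content.
inline void write_fake_sysfs(const std::string& path,
                             const std::string& content) {
    std::ofstream out(path, std::ios::trunc);
    out << content;
}
```

A test can then pass a lambda that appends to a `std::vector<int>` as the transport and toggle the fake file between `1` and `0`, exercising the enable/disable logic without opening a socket.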
---
## Alternative IPC Mechanism Considered
I evaluated POSIX shared memory with semaphores as an alternative to UNIX domain sockets. The theoretical appeal is clear: no serialization, no kernel-mediated data copy, potentially lower latency. However, I am considerably less practiced with `shm_open`/`mmap`/`sem_post` than I am with socket-based communication, and more importantly, shared memory is significantly harder to unit-test in isolation. Sockets expose a clean, file-descriptor-based interface that maps naturally to mock-able abstractions. Shared memory regions and semaphore lifecycles would have added complexity to the test harness for uncertain gain at this data rate. Domain sockets were the pragmatic choice.
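To make the "file-descriptor-based interface" point concrete, the sketch below exercises a reader over `socketpair()`, which needs no filesystem path and no listening thread. It assumes the wire format is a single raw `int` per connection, as the corrupted-message tests suggest; `read_int_message` and `round_trip` are hypothetical helpers, not the actual `ConsumerThread` logic.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical reader: fills `out` only if a full int arrives on `fd`.
// Returns false on EOF or error before sizeof(int) bytes were received,
// which is how a short (corrupted) message gets dropped.
inline bool read_int_message(int fd, int& out) {
    int value = 0;
    char* buf = reinterpret_cast<char*>(&value);
    std::size_t got = 0;
    while (got < sizeof(value)) {
        const ssize_t n = ::recv(fd, buf + got, sizeof(value) - got, 0);
        if (n <= 0) {
            return false;  // Peer closed early, or recv() failed.
        }
        got += static_cast<std::size_t>(n);
    }
    out = value;
    return true;
}

// Test-side helper: writes `len` raw bytes into one end of a socketpair,
// closes that end, and reports what the reader decodes from the other end.
inline bool round_trip(const void* data, std::size_t len, int& out) {
    int fds[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
        return false;
    }
    bool ok = true;
    if (len > 0) {
        ok = ::send(fds[0], data, len, 0) == static_cast<ssize_t>(len);
    }
    ::close(fds[0]);  // Signals end-of-stream to the reader.
    ok = ok && read_int_message(fds[1], out);
    ::close(fds[1]);
    return ok;
}
```

An equivalent test for a shared-memory transport would have to create, size, map, and unlink a named region and manage a semaphore around it, which is the extra harness complexity referred to above.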
---
## Design Decision Changed Mid-Development
Initially I had planned a looser boundary between the core logic and Qt, with the producer potentially depending on Qt primitives for threading or signalling. Early on, I decided to keep Qt strictly confined to the GUI layer and the consumer thread, nothing more. The producer, the sysfs reader, and the IPC bridge are plain C++ with no Qt dependency whatsoever.
The reason is simple: that code stays portable. If tomorrow the producer needs to run on a microcontroller, a bare-metal embedded target, or any environment where Qt is not available or not desirable, the only thing that needs replacing is the transport callback. The core logic moves untouched. It also makes unit-testing the producer significantly cleaner and easier: no Qt test infrastructure is needed, just standard C++.


@@ -73,3 +73,18 @@ target_link_libraries(test_main_window
)
add_test(NAME test_main_window COMMAND test_main_window)
add_executable(test_race_conditions
test_race_conditions.cxx
)
target_link_libraries(test_race_conditions
PRIVATE
core
gtest
gtest_main
Qt5::Core
Qt5::Test
)
add_test(NAME test_race_conditions COMMAND test_race_conditions)


@@ -1,7 +1,11 @@
#include <gtest/gtest.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <QCoreApplication>
#include <QSignalSpy>
#include <chrono>
#include <cstdint>
#include <cstring>
#include <string>
#include <thread>
#include "Consumer.hpp"
#include "UnixIpcBridge.hpp"
@@ -118,3 +122,94 @@ TEST(ConsumerThreadTest, StopsCleanlyWhenNeverStarted)
// stop() on a consumer that was never started must not crash
consumer.stop();
}
// ---------------------------------------------------------------------------
// Requirement 2: Consumer receiving corrupted data (non-numeric)
// ---------------------------------------------------------------------------
/// Helper: raw-connect to a UNIX socket and send arbitrary bytes.
static void send_raw_bytes(const std::string& path, const void* data,
size_t len)
{
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
ASSERT_GE(fd, 0);
struct sockaddr_un addr = {};
addr.sun_family = AF_UNIX;
std::strncpy(addr.sun_path, path.c_str(), sizeof(addr.sun_path) - 1);
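const int rc =
connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
if (rc != 0)
{
close(fd);  // Do not leak the descriptor when the connect fails.
FAIL() << "connect() to " << path << " failed";
}
if (len > 0)
{
// A short or failed write would silently change what the test exercises.
EXPECT_EQ(::send(fd, data, len, 0), static_cast<ssize_t>(len));
}
close(fd);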
}
TEST(ConsumerThreadTest, DropsCorruptedShortMessage)
{
const std::string sock = "/tmp/test_ct_corrupt_short.sock";
ConsumerThread consumer(sock);
QSignalSpy spy(&consumer, &ConsumerThread::valueReceived);
consumer.start();
// Send only 2 bytes instead of sizeof(int)==4 — corrupted / partial message
uint16_t garbage = 0xBEEF;
send_raw_bytes(sock, &garbage, sizeof(garbage));
// Give the consumer time to process (or not)
spy.wait(500);
consumer.stop();
// No signal should have been emitted
EXPECT_EQ(spy.count(), 0);
}
TEST(ConsumerThreadTest, DropsEmptyConnection)
{
const std::string sock = "/tmp/test_ct_corrupt_empty.sock";
ConsumerThread consumer(sock);
QSignalSpy spy(&consumer, &ConsumerThread::valueReceived);
consumer.start();
// Connect and immediately close — zero bytes sent
send_raw_bytes(sock, nullptr, 0);
spy.wait(500);
consumer.stop();
EXPECT_EQ(spy.count(), 0);
}
TEST(ConsumerThreadTest, SurvivesCorruptedThenReceivesValid)
{
const std::string sock = "/tmp/test_ct_corrupt_then_valid.sock";
ConsumerThread consumer(sock);
QSignalSpy spy(&consumer, &ConsumerThread::valueReceived);
consumer.start();
// First: send corrupted (1 byte)
uint8_t one_byte = 0xFF;
send_raw_bytes(sock, &one_byte, sizeof(one_byte));
std::this_thread::sleep_for(std::chrono::milliseconds(50));
// Then: send a valid int via the normal bridge
UnixIpcBridge bridge(sock);
bridge.send(777);
// Wait for the valid signal
for (int attempt = 0; spy.count() < 1 && attempt < 20; ++attempt)
{
spy.wait(100);
}
consumer.stop();
// The corrupted message must have been dropped, valid one received
ASSERT_EQ(spy.count(), 1);
EXPECT_EQ(spy.at(0).at(0).toInt(), 777);
}


@@ -0,0 +1,200 @@
// test_race_conditions.cxx
// SPDX-License-Identifier: GPL-3.0-or-later
// Author: Unai Blazquez <unaibg2000@gmail.com>
#include <gtest/gtest.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
#include <QCoreApplication>
#include <QSignalSpy>
#include <atomic>
#include <chrono>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <mutex>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>
#include "Consumer.hpp"
#include "Producer.hpp"
#include "UnixIpcBridge.hpp"
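// QCoreApplication requires argc > 0 and a valid argv[0], and both must
// outlive the application object.
static int argc_ = 1;
static char arg0_[] = "test_race_conditions";
static char* argv_[] = {arg0_, nullptr};
static QCoreApplication app_(argc_, argv_);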
TEST(RaceConditionTest, RepeatedStartStopWhileProducerSends)
{
const std::string sock = "/tmp/test_race.sock";
constexpr int kCycles = 20;
// Watchdog: if the test takes longer than 15s, declare deadlock.
std::atomic<bool> test_done{false};
std::thread watchdog([&test_done]() {
for (int i = 0; i < 150 && !test_done.load(); ++i)
{
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
if (!test_done.load())
{
std::cerr
<< "DEADLOCK DETECTED: RepeatedStartStopWhileProducerSends timed out"
<< std::endl;
std::abort();
}
});
std::atomic<bool> producer_running{true};
std::thread producer([&]() {
while (producer_running.load())
{
try
{
UnixIpcBridge bridge(sock);
bridge.send(42);
}
catch (const std::runtime_error&)
{
// Expected: consumer socket not ready or just torn down.
}
std::this_thread::sleep_for(std::chrono::milliseconds(5));
}
});
for (int i = 0; i < kCycles; ++i)
{
ConsumerThread consumer(sock);
consumer.start();
// Let it run briefly so the producer can connect during some cycles.
std::this_thread::sleep_for(std::chrono::milliseconds(10 + (i % 5) * 5));
// stop() must return without deadlock every single time.
consumer.stop();
}
producer_running.store(false);
producer.join();
test_done.store(true);
watchdog.join();
// If we reach here, no deadlock across kCycles start/stop cycles.
SUCCEED();
}
TEST(RaceConditionTest, ProducerSurvivesConsumerCrash)
{
const std::string sock = "/tmp/test_crash.sock";
const std::string sysfs = "./fake_sysfs_race";
// Prepare sysfs file so the producer is in Enabled state.
{ std::ofstream(sysfs) << "1\n"; }
// Track what the producer sends.
std::vector<int> sent_values;
std::mutex sent_mutex;
std::vector<std::string> logs;
std::mutex log_mutex;
auto make_safe_send = [&](const std::string& path) {
return [&, path](int value) {
try
{
UnixIpcBridge bridge(path);
bridge.send(value);
std::lock_guard<std::mutex> lk(sent_mutex);
sent_values.push_back(value);
}
catch (const std::runtime_error&)
{
// Consumer is down — expected during the "crash" window.
}
};
};
Producer producer(
sysfs, make_safe_send(sock), []() { return 123; },
[&](const std::string& msg) {
std::lock_guard<std::mutex> lk(log_mutex);
logs.push_back(msg);
},
[](std::chrono::milliseconds) {
// Use a short sleep so the test runs fast.
std::this_thread::sleep_for(std::chrono::milliseconds(20));
});
// Phase 1: start consumer, start producer, let a few values flow.
{
ConsumerThread consumer(sock);
QSignalSpy spy(&consumer, &ConsumerThread::valueReceived);
consumer.start();
producer.start();
// Wait for at least 2 values to arrive.
for (int attempt = 0; spy.count() < 2 && attempt < 50; ++attempt)
{
spy.wait(100);
}
ASSERT_GE(spy.count(), 2) << "Phase 1: producer should have delivered values";
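// Simulate a hard crash: force-close the consumer's server fd from
// outside its thread, so that its next accept() fails with EBADF.
// This approximates the kernel reclaiming fds on SIGKILL / abort().
// Note: on Linux, close() does not reliably wake a thread that is
// already blocked in accept(); the failure is only observed once the
// consumer calls accept() again.
//
// We find the server fd by calling getsockname() on open fds and
// matching against our socket path.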
for (int fd = 3; fd < 1024; ++fd)
{
struct sockaddr_un addr = {};
socklen_t len = sizeof(addr);
if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) == 0 &&
addr.sun_family == AF_UNIX &&
std::string(addr.sun_path) == sock)
{
::close(fd); // Yank the fd — consumer thread crashes out of accept()
break;
}
}
// Destructor calls stop(), which joins the (now-exited) thread and
// cleans up. In a real crash no cleanup runs, but we can't leak
// threads in a test process.
}
// Phase 2: producer is still running with no consumer (sends will fail).
std::this_thread::sleep_for(std::chrono::milliseconds(200));
// Phase 3: bring up a fresh consumer. Producer should resume delivering.
{
ConsumerThread consumer2(sock);
QSignalSpy spy2(&consumer2, &ConsumerThread::valueReceived);
consumer2.start();
for (int attempt = 0; spy2.count() < 2 && attempt < 50; ++attempt)
{
spy2.wait(100);
}
consumer2.stop();
ASSERT_GE(spy2.count(), 2)
<< "Phase 3: producer must deliver to a new consumer after crash";
// Values received by the second consumer should all be 123.
for (int i = 0; i < spy2.count(); ++i)
{
EXPECT_EQ(spy2.at(i).at(0).toInt(), 123);
}
}
producer.stop();
// Producer logged throughout all three phases.
{
std::lock_guard<std::mutex> lk(log_mutex);
EXPECT_GE(logs.size(), 3u) << "Producer should have kept logging";
}
}