Distributed Document Search in Go: Requirements Document
July 31, 2024
Introduction
This document specifies the requirements for a Go implementation of a distributed search cluster system. The system enables distributed document search using the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm across multiple worker nodes, coordinated through Apache ZooKeeper for leader election and service discovery. The system supports automatic failover, dynamic scaling of worker nodes, and efficient parallel document processing.
Glossary
- Cluster: A group of interconnected nodes working together to provide distributed search functionality
- Node: A single instance of the application that can act as either a Coordinator or Worker
- Coordinator: The elected leader node that receives search queries, distributes work to workers, and aggregates results
- Worker: A follower node that processes document search tasks and returns term frequency data
- ZooKeeper: Apache ZooKeeper, a distributed coordination service used for leader election and service registry
- Znode: A data node in ZooKeeper's hierarchical namespace
- Ephemeral_Znode: A znode that exists only while the session that created it is active
- Sequential_Znode: A znode with a monotonically increasing counter appended to its name
- Leader_Election: The process by which nodes compete to become the Coordinator
- Service_Registry: A ZooKeeper-based registry where nodes register their HTTP endpoints for discovery
- TF-IDF: Term Frequency-Inverse Document Frequency, an algorithm for measuring document relevance
- Term_Frequency: The ratio of a term's occurrences to total words in a document
- Inverse_Document_Frequency: A measure of how much information a term provides across all documents
- Task: A work unit containing search terms and document paths sent from Coordinator to Worker
- Result: The response from a Worker containing document data with term frequencies
- DocumentData: A mapping of search terms to their frequencies within a single document
- Protobuf: Protocol Buffers, a binary serialization format used for external API communication
- HTTP_Endpoint: A URL path that accepts HTTP requests for specific functionality
Requirements
Requirement 1: ZooKeeper Connection Management
User Story: As a cluster node, I want to establish and maintain a connection to ZooKeeper, so that I can participate in leader election and service discovery.
Acceptance Criteria
- WHEN the application starts, THE Node SHALL connect to ZooKeeper at a configurable address with a configurable session timeout
- WHEN the ZooKeeper connection is established, THE Node SHALL log a success message indicating connection status
- WHEN the ZooKeeper connection is lost, THE Node SHALL notify internal components and initiate graceful shutdown
- WHILE connected to ZooKeeper, THE Node SHALL maintain the session through automatic heartbeats handled by the ZooKeeper client library
- WHEN the application terminates, THE Node SHALL close the ZooKeeper connection gracefully
Requirement 2: Leader Election
User Story: As a cluster node, I want to participate in leader election, so that exactly one node becomes the Coordinator while others become Workers.
Acceptance Criteria
- WHEN a Node joins the cluster, THE Leader_Election SHALL create an ephemeral sequential znode under the /election namespace with prefix c_
- WHEN determining leadership, THE Leader_Election SHALL retrieve all children of /election, sort them lexicographically, and identify the smallest znode as the leader
- WHEN a Node's znode is the smallest, THE Leader_Election SHALL invoke the onElectedToBeLeader callback to transition the node to Coordinator role
- WHEN a Node's znode is not the smallest, THE Leader_Election SHALL invoke the onWorker callback and set a watch on its immediate predecessor znode
- WHEN a watched predecessor znode is deleted, THE Leader_Election SHALL trigger re-election to determine if the node should become the new leader
- IF the predecessor znode no longer exists during watch setup, THEN THE Leader_Election SHALL retry the election process
Requirement 3: Service Registry
User Story: As a cluster node, I want to register my HTTP endpoint in a service registry, so that other nodes can discover and communicate with me.
Acceptance Criteria
- WHEN the Service_Registry is initialized, THE Service_Registry SHALL create the registry parent znode if it does not exist
- THE Service_Registry SHALL support two separate registries: /workers_service_registry for Workers and /coordinators_service_registry for Coordinators
- WHEN a Node registers to the cluster, THE Service_Registry SHALL create an ephemeral sequential znode containing the node's HTTP endpoint address
- WHEN a Node is already registered, THE Service_Registry SHALL prevent duplicate registration and log a warning
- WHEN a Node unregisters from the cluster, THE Service_Registry SHALL delete its znode from the registry
- WHEN a Node requests all service addresses, THE Service_Registry SHALL return a list of all registered endpoint addresses
- WHEN a Node registers for updates, THE Service_Registry SHALL set a watch on the registry and update the cached address list when children change
- WHEN the registry children change, THE Service_Registry SHALL automatically refresh the cached list of service addresses
Requirement 4: Role Transition
User Story: As a cluster node, I want to transition between Coordinator and Worker roles based on election results, so that the cluster can dynamically adapt to node changes.
Acceptance Criteria
- WHEN a Node is elected as leader, THE Node SHALL unregister from the workers registry and register in the coordinators registry
- WHEN a Node is elected as leader, THE Node SHALL start an HTTP server with the /search endpoint and stop any existing worker server
- WHEN a Node becomes a Worker, THE Node SHALL register in the workers registry with its /task endpoint address
- WHEN a Node becomes a Worker, THE Node SHALL start an HTTP server with the /task endpoint if not already running
- WHEN transitioning from Worker to Coordinator, THE Node SHALL stop the worker HTTP server before starting the coordinator server
- WHEN a Node is elected as leader, THE Node SHALL register for updates on the workers registry to track available workers
Requirement 5: HTTP Server
User Story: As a cluster node, I want to run an HTTP server, so that I can receive health checks and process requests appropriate to my role.
Acceptance Criteria
- THE HTTP_Server SHALL expose a /status endpoint that responds to GET requests with a health check message
- THE HTTP_Server SHALL expose a dynamic endpoint (/search or /task) based on the node's current role
/searchor/task) based on the node's current role - WHEN a POST request is received on the role-specific endpoint, THE HTTP_Server SHALL pass the request body to the appropriate handler and return the response
- WHEN a non-POST request is received on the role-specific endpoint, THE HTTP_Server SHALL reject the request with an HTTP 405 (Method Not Allowed) status
- THE HTTP_Server SHALL use a configurable port number provided at startup
- THE HTTP_Server SHALL handle concurrent requests using goroutines
- WHEN the HTTP_Server is stopped, THE HTTP_Server SHALL gracefully shutdown allowing in-flight requests to complete
Requirement 6: HTTP Client
User Story: As a Coordinator, I want to send HTTP requests to Workers, so that I can distribute search tasks and collect results.
Acceptance Criteria
- THE HTTP_Client SHALL send POST requests with binary payloads to worker endpoints
- THE HTTP_Client SHALL support concurrent requests to multiple workers using goroutines
- WHEN sending tasks to workers, THE HTTP_Client SHALL return results through channels for aggregation
- IF a worker request fails, THEN THE HTTP_Client SHALL handle the error gracefully without crashing the Coordinator
- THE HTTP_Client SHALL deserialize worker responses into Result structures
Requirement 7: Search Coordinator
User Story: As a Coordinator, I want to handle search queries by distributing work to available Workers and aggregating results, so that users can search across all documents efficiently.
Acceptance Criteria
- WHEN a search request is received, THE Search_Coordinator SHALL parse the protobuf Request message to extract the search query
- WHEN processing a search query, THE Search_Coordinator SHALL tokenize the query into individual search terms
- WHEN distributing work, THE Search_Coordinator SHALL retrieve the list of available workers from the workers service registry
- IF no workers are available, THEN THE Search_Coordinator SHALL return an empty response
- WHEN distributing work, THE Search_Coordinator SHALL split the document list evenly among available workers
- WHEN creating tasks, THE Search_Coordinator SHALL include the search terms and assigned document paths in each Task
- THE Search_Coordinator SHALL send tasks to all workers concurrently and wait for all responses
- WHEN aggregating results, THE Search_Coordinator SHALL merge all worker results and calculate final TF-IDF scores
- WHEN returning results, THE Search_Coordinator SHALL sort documents by score in descending order
- THE Search_Coordinator SHALL serialize the response as a protobuf Response message containing per-document statistics
Requirement 8: Search Worker
User Story: As a Worker, I want to process document search tasks, so that I can calculate term frequencies for my assigned documents.
Acceptance Criteria
- WHEN a task request is received, THE Search_Worker SHALL deserialize the Task from the request body
- WHEN processing a task, THE Search_Worker SHALL read each assigned document from the filesystem
- WHEN processing a document, THE Search_Worker SHALL parse the document content into individual words
- WHEN calculating term frequencies, THE Search_Worker SHALL compute the frequency for each search term in each document
- THE Search_Worker SHALL create a Result containing DocumentData for each processed document
- THE Search_Worker SHALL serialize the Result and return it in the response body
- IF a document cannot be read, THEN THE Search_Worker SHALL skip that document and continue processing others
Requirement 9: TF-IDF Algorithm
User Story: As a search system, I want to implement the TF-IDF algorithm, so that documents can be ranked by relevance to search queries.
Acceptance Criteria
- WHEN calculating term frequency, THE TFIDF SHALL compute the ratio of term occurrences to total words in the document (case-insensitive)
- WHEN creating document data, THE TFIDF SHALL calculate term frequency for each search term and store in DocumentData
- WHEN calculating document scores, THE TFIDF SHALL compute IDF as log10(total_documents / documents_containing_term) for each term
- IF a term appears in zero documents, THEN THE TFIDF SHALL use an IDF of 0 for that term
- WHEN calculating final scores, THE TFIDF SHALL sum the product of TF and IDF for each term
- WHEN parsing text, THE TFIDF SHALL split on punctuation (periods, commas, spaces, hyphens, question marks, exclamation marks, semicolons, colons) and newlines
Requirement 10: Data Models and Serialization
User Story: As a developer, I want well-defined data models with serialization support, so that nodes can exchange data reliably.
Acceptance Criteria
- THE Task model SHALL contain a list of search terms and a list of document paths
- THE Result model SHALL contain a map of document paths to DocumentData
- THE DocumentData model SHALL contain a map of terms to their frequencies (float64)
- WHEN serializing Task and Result for internal communication, THE Serializer SHALL use JSON encoding
- WHEN deserializing Task and Result, THE Serializer SHALL reconstruct the original data structures
- FOR ALL valid Task objects, serializing then deserializing SHALL produce an equivalent object (round-trip property)
- FOR ALL valid Result objects, serializing then deserializing SHALL produce an equivalent object (round-trip property)
Requirement 11: Protocol Buffers API
User Story: As an external client, I want to communicate with the search cluster using Protocol Buffers, so that I can efficiently send queries and receive results.
Acceptance Criteria
- THE Protobuf schema SHALL define a Request message with a required search_query string field
- THE Protobuf schema SHALL define a Response message with a repeated DocumentStats field
- THE DocumentStats message SHALL contain document_name (required), score (optional), document_size (optional), and author (optional) fields
- WHEN receiving a search request, THE Search_Coordinator SHALL parse the protobuf Request message
- WHEN sending a search response, THE Search_Coordinator SHALL serialize the Response as protobuf
Requirement 12: Application Lifecycle
User Story: As an operator, I want to start and stop cluster nodes cleanly, so that the cluster remains stable during deployments.
Acceptance Criteria
- WHEN the application starts, THE Application SHALL accept a port number as a command-line argument (defaulting to 8080)
- WHEN the application starts, THE Application SHALL connect to ZooKeeper, initialize service registries, and begin leader election
- WHILE running, THE Application SHALL block the main goroutine until a shutdown signal is received
- WHEN a shutdown signal is received, THE Application SHALL unregister from service registries and close the ZooKeeper connection
- THE Application SHALL handle SIGINT and SIGTERM signals for graceful shutdown
Requirement 13: Go-Specific Implementation
User Story: As a Go developer, I want the implementation to follow Go idioms and best practices, so that the code is maintainable and performant.
Acceptance Criteria
- THE implementation SHALL use the go-zookeeper/zk library for ZooKeeper integration
- THE implementation SHALL use the standard net/http package for HTTP server and client
- THE implementation SHALL use google.golang.org/protobuf for Protocol Buffers
- THE implementation SHALL use goroutines and channels for concurrent task distribution
- THE implementation SHALL use idiomatic Go error handling with explicit error returns
- THE implementation SHALL organize code into packages following Go conventions (cluster, networking, search, model)
- THE implementation SHALL define interfaces for testability (OnElectionCallback, RequestHandler)