hashfile

package module
v1.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 19, 2025 License: MIT Imports: 10 Imported by: 0

README

hashfile

Go Reference Go Report Card

A lightweight, efficient tool for adding CRC32-based integrity comments to source code files. hashfile helps you detect unintended changes to your source files by embedding a checksum comment that can be quickly verified.

Features

  • Efficient streaming algorithm - Single buffer allocation with sliding window, processes files of any size
  • No-op on unchanged files - Only modifies files when the integrity comment is missing or incorrect
  • Multiple language support - Auto-detects comment style based on file extension (Go, Python, C/C++, SQL, HTML, Shell, Ruby, JavaScript, CSS, Templ)
  • Preserves file attributes - Maintains permissions and ownership when updating files
  • Line ending aware - Preserves CRLF vs LF line endings
  • Zero dependencies - Uses only Go standard library

Use Cases

  • Verify source code hasn't been modified during deployment
  • Detect accidental edits to configuration files
  • Ensure generated code remains unchanged
  • Quick integrity checks for code reviews
  • Build system verification step

Note: This is NOT a security tool. It uses CRC32 for fast integrity checking, not cryptographic security. Do not use this to detect malicious modifications.

Installation

From Source
go install github.com/dmoose/hashfile/cmd/hashfile@latest
Build from Repository
git clone https://github.com/dmoose/hashfile.git
cd hashfile
make build
# Binary will be in bin/hashfile
As a Library
go get github.com/dmoose/hashfile

Command Line Usage

Add/Update Integrity Comments

Add integrity comments to one or more files:

# Single file
hashfile add main.go

# Multiple files
hashfile add main.go handler.go utils.go

# Using glob patterns
hashfile add src/*.go

# Specify comment style explicitly
hashfile add -style=python script.txt

What happens:

  • Calculates CRC32 of file content (excluding the integrity comment itself)
  • Adds a comment line at the end: // FileIntegrity: ABCD1234
  • If comment already exists and is correct, file is not modified (no-op)
  • If comment exists but is wrong, it's updated with the correct hash
Verify File Integrity

Check if files have been modified:

# Verify files (silent mode, use exit code)
hashfile verify *.go
echo $?  # 0 = all valid, 1 = invalid or error

# Quiet mode (no output at all)
hashfile verify -q *.go

Exit codes:

  • 0 - All files verified successfully
  • 1 - One or more files invalid or errors occurred
Check with Output

Display human-readable verification status:

hashfile check src/*.go

Example output:

✓ src/main.go
✓ src/handler.go
✗ src/utils.go (integrity check failed)

Total: 3 files, 2 valid, 1 invalid, 0 errors

Library Usage

Basic Example
package main

import (
    "fmt"
    "log"
    
    "github.com/dmoose/hashfile"
)

func main() {
    // Add integrity comment (auto-detects .go extension)
    if err := hashfile.ProcessFile("main.go"); err != nil {
        log.Fatal(err)
    }
    
    // Verify integrity
    valid, err := hashfile.VerifyFile("main.go")
    if err != nil {
        log.Fatal(err)
    }
    
    if valid {
        fmt.Println("File integrity verified!")
    } else {
        fmt.Println("File has been modified!")
    }
}
Custom Configuration
package main

import (
    "github.com/dmoose/hashfile"
)

func main() {
    // Create custom configuration
    config := hashfile.Config{
        CommentStyle: hashfile.PythonStyle,
        BufferSize:   128 * 1024, // 128KB buffer
    }
    
    // Use custom config
    writer := hashfile.NewWriter(config)
    writer.ProcessFile("script.py")
    
    reader := hashfile.NewReader(config)
    valid, _ := reader.VerifyFile("script.py")
}
Supported Comment Styles
hashfile.GoStyle      // FileIntegrity: ABCD1234
hashfile.PythonStyle  # FileIntegrity: ABCD1234
hashfile.CStyle       // FileIntegrity: ABCD1234
hashfile.SQLStyle     -- FileIntegrity: ABCD1234
hashfile.HTMLStyle    <!-- FileIntegrity: ABCD1234 -->
hashfile.ShellStyle   # FileIntegrity: ABCD1234
hashfile.RubyStyle    # FileIntegrity: ABCD1234
hashfile.JSStyle      // FileIntegrity: ABCD1234
hashfile.CSSStyle     /* FileIntegrity: ABCD1234 */
hashfile.TemplStyle   const FileIntegrity = "ABCD1234"

Note: TemplStyle uses a Go constant declaration instead of a comment. Since templ files compile to Go code, this allows the integrity hash to be embedded in generated HTML comments for traceability (e.g., <!-- Template Integrity: { FileIntegrity } -->).

Auto-Detection by Extension

The library automatically selects the appropriate comment style:

Extension Comment Style
.go // ...
.py # ...
.c, .h, .cpp, .java, .js, .ts // ...
.sql -- ...
.html, .xml <!-- ... -->
.sh, .bash # ...
.rb # ...
.css, .scss, .sass /* ... */
.templ const FileIntegrity = "..."

How It Works

Algorithm Overview
  1. Streaming with Sliding Window

    • Reads file in chunks (default 64KB buffer)
    • Maintains a small sliding window (≈30 bytes) at the end
    • Calculates CRC32 of all content except the window
    • Single buffer allocation for efficiency
  2. At End of File

    • Examines the window for existing integrity comment
    • If found and correct: writes temp file, compares to original, no-op if identical
    • If found but wrong: updates comment with new CRC
    • If not found: adds new comment
  3. Content Hashing

    • CRC32 includes all file content EXCEPT the integrity comment line itself
    • All content including trailing newlines gets CRC'd
    • If a file doesn't end with a newline, one is added and CRC'd before adding the comment
  4. No-Op Optimization

    • Always writes to temp file during streaming
    • If existing integrity comment has matching CRC, deletes temp and leaves original untouched
    • When file is modified, preserves attributes (permissions, ownership) during replacement
Example

Before:

package main

func main() {
    println("Hello")
}

After adding integrity:

package main

func main() {
    println("Hello")
}

// FileIntegrity: A1B2C3D4

Example with CSS:

body {
    color: red;
}

/* FileIntegrity: E5F6A7B8 */

Example with Templ:

package components

templ Hello() {
    <div>Hello</div>
}

const FileIntegrity = "C9D0E1F2"

If you modify the code:

package main

func main() {
    println("Hello, World!")  // Changed
}
// FileIntegrity: A1B2C3D4  // Now invalid!

Verification will fail until you run hashfile add again to update the hash.

Performance

The streaming algorithm is optimized for efficiency:

  • Memory: Single 64KB buffer allocation (configurable)
  • I/O: Buffered reads/writes, minimal system calls
  • Processing: Each byte processed exactly once for CRC
  • Allocations: Minimal heap allocations (30 for process, 7 for verify on ~26KB file)

Benchmark results (Apple M1 Max):

BenchmarkProcessFile-10    	    3615	    304277 ns/op	   71972 B/op	      30 allocs/op
BenchmarkVerifyFile-10     	   66835	     17837 ns/op	   66386 B/op	       7 allocs/op

On a ~26KB test file: ~3,600 process operations/sec, ~66,000 verify operations/sec.

Testing

Run the test suite:

# All tests
make test

# With coverage report
make test-coverage

# Benchmarks
make bench

Building

# Build binary
make build

# Build and test
make all

# Install to $GOPATH/bin
make install

# Clean artifacts
make clean

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Ensure all tests pass: make test
  5. Run linter: make lint (requires golangci-lint)
  6. Submit a pull request

License

MIT License - see LICENSE file for details.

FAQ

Q: Why CRC32 instead of SHA256 or other cryptographic hash?
A: This tool is designed for detecting accidental changes, not malicious tampering. CRC32 is extremely fast and sufficient for this purpose. If you need cryptographic security, use proper code signing tools.

Q: What if I need to edit a file with an integrity comment?
A: Just edit it normally. The integrity comment will show the file has changed. Run hashfile add again to update the comment with the new hash.

Q: Does this work with binary files?
A: Technically yes, but don't. Will append text to binary file which is almost certainly not what you wanted.

Q: Can I use this in CI/CD pipelines?
A: Yes! Use hashfile verify -q *.go and check the exit code. It's silent and fast.

Q: What happens if I manually change the hash in the comment?
A: Verification will fail. The hash must match the actual CRC32 of the file content.

Q: Does it modify the file's timestamp?
A: Only if the file actually needs to be updated. If the hash is already correct (no-op case), the file is not touched and timestamps are preserved.

Documentation

Overview

Package hashfile provides efficient streaming CRC32-based file integrity checking. It adds a comment line at the end of files containing a CRC32 checksum of the file content (excluding the comment itself). This allows detection of changes to source code files.

The package uses a streaming algorithm with a single buffer allocation and sliding window, making it efficient for files of any size. It preserves file attributes (permissions, ownership) and only modifies files when the integrity comment is missing or incorrect.

Index

Constants

This section is empty.

Variables

View Source
var (
	GoStyle     = CommentStyle{Prefix: "// ", Suffix: "", PrefixContainsKey: false}
	CStyle      = CommentStyle{Prefix: "// ", Suffix: "", PrefixContainsKey: false}
	PythonStyle = CommentStyle{Prefix: "# ", Suffix: "", PrefixContainsKey: false}
	SQLStyle    = CommentStyle{Prefix: "-- ", Suffix: "", PrefixContainsKey: false}
	HTMLStyle   = CommentStyle{Prefix: "<!-- ", Suffix: " -->", PrefixContainsKey: false}
	ShellStyle  = CommentStyle{Prefix: "# ", Suffix: "", PrefixContainsKey: false}
	RubyStyle   = CommentStyle{Prefix: "# ", Suffix: "", PrefixContainsKey: false}
	JSStyle     = CommentStyle{Prefix: "// ", Suffix: "", PrefixContainsKey: false}
	CSSStyle    = CommentStyle{Prefix: "/* ", Suffix: " */", PrefixContainsKey: false}
	TemplStyle  = CommentStyle{Prefix: "const FileIntegrity = \"", Suffix: "\"", PrefixContainsKey: true}
)

Predefined comment styles for common languages.

Functions

func ProcessFile

func ProcessFile(filename string) error

ProcessFile adds or updates integrity comment with auto-detected comment style.

func ProcessGoFile

func ProcessGoFile(filename string) error

ProcessGoFile adds or updates integrity comment in a Go source file.

func VerifyFile

func VerifyFile(filename string) (bool, error)

VerifyFile verifies file integrity with auto-detected comment style.

func VerifyGoFile

func VerifyGoFile(filename string) (bool, error)

VerifyGoFile verifies the integrity of a Go source file.

Types

type CommentStyle

type CommentStyle struct {
	Prefix            string // Comment prefix (e.g., "// " for Go/C)
	Suffix            string // Comment suffix (e.g., " -->" for HTML, empty for most)
	PrefixContainsKey bool   // If true, Prefix already includes "FileIntegrity" (e.g., for const declarations)
}

CommentStyle defines the comment format for different programming languages.

type Config

type Config struct {
	CommentStyle CommentStyle
	BufferSize   int // Buffer size for streaming (default 64KB)
}

Config holds processing configuration.

func ConfigForExtension

func ConfigForExtension(ext string) Config

ConfigForExtension returns a Config with appropriate comment style for the given file extension. Returns DefaultConfig for unknown extensions.

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns configuration with Go-style comments and standard buffer size.

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader verifies file integrity using the same efficient streaming approach.

func NewReader

func NewReader(config Config) *Reader

NewReader creates a Reader with the given configuration.

func (*Reader) VerifyFile

func (r *Reader) VerifyFile(filename string) (bool, error)

VerifyFile checks if a file's integrity comment matches its content.

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer processes files using efficient streaming algorithm.

func NewWriter

func NewWriter(config Config) *Writer

NewWriter creates a Writer with the given configuration.

func (*Writer) ProcessFile

func (w *Writer) ProcessFile(filename string) error

ProcessFile adds or updates the integrity comment in a file. It uses a streaming algorithm to minimize memory usage and only modifies the file if the integrity comment is missing or incorrect. File attributes (permissions, ownership) are preserved.

Directories

Path Synopsis
cmd
hashfile command
hashfile is a command-line tool for adding, verifying, and managing file integrity comments in source code files.
hashfile is a command-line tool for adding, verifying, and managing file integrity comments in source code files.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL