Summary
The ChecksumCalculator class in emissary provides hashing and checksum generation, but it includes or defaults to algorithms that are unsuitable for secure cryptographic use: SHA-1, which is deprecated, and CRC32 and SSDEEP, which are not cryptographic hash functions. These algorithms remain valid for certain non-security-critical tasks, but they can expose users to security risks if used where strong cryptographic guarantees are required.
Requirement from NIST regarding SHA-1
https://csrc.nist.gov/projects/hash-functions#:~:text=NIST%20deprecated%20the%20use%20of,use%20of%20the%20SHA%2D1.
Federal agencies should use SHA-2 or SHA-3 as an alternative to SHA-1.
Further guidance will be available soon. Send questions on the transition to sha-1-transition@nist.gov.
https://www.nist.gov/news-events/news/2022/12/nist-retires-sha-1-cryptographic-algorithm
Mitigation and Fix
Make it clear to developers and users that the ChecksumCalculator is specific to the "Known File Filter" (KFF) document similarity feature and is not intended to suggest or endorse global use as a cryptographically secure hashing or checksum mechanism.
While these default algorithms cannot be changed without breaking the intended use case, their limited scope can be clearly documented, and unintended use can be discouraged by applying more restrictive access modifiers in the ChecksumCalculator class.
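As a rough illustration of that mitigation, the sketch below shows one way a KFF-only checksum helper could combine a documented warning with package-private access. The class name `KffChecksumSketch` and its methods are hypothetical and are not taken from emissary; the real ChecksumCalculator may be structured differently.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

/**
 * Hypothetical sketch, not the actual emissary class.
 * Computes legacy checksums for Known File Filter (KFF) lookups only.
 * NOT suitable for integrity protection, authentication, or signatures.
 */
final class KffChecksumSketch {

    // Package-private constructor: only code in the same package can create one.
    KffChecksumSketch() {
    }

    // Package-private on purpose, so the SHA-1 default does not leak into general use.
    String sha1Hex(byte[] data) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder sb = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // CRC32 is an error-detecting checksum, not a cryptographic hash.
    long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        KffChecksumSketch kff = new KffChecksumSketch();
        byte[] sample = {'a', 'b', 'c'};
        System.out.println("SHA-1 (KFF only): " + kff.sha1Hex(sample));
        System.out.println("CRC32 (KFF only): " + Long.toHexString(kff.crc32(sample)));
    }
}
```

With this shape, code outside the package cannot instantiate the helper at all, and the Javadoc states the restriction for anyone reading the source.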
Details
Within ChecksumCalculator.java, the following points raise potential security concerns:
SHA-1:
SHA-1 has been widely deprecated for cryptographic purposes due to known collision attacks.
The constructor defaults to "SHA-1" if no specific algorithm is provided.
CRC32:
CRC32 is a simple error-detecting checksum, not a cryptographic hash function. It is unsuitable for security-critical integrity checks because its output is only 32 bits and collisions are easy to construct deliberately.
SSDEEP (Fuzzy Hashing):
SSDEEP is a context-specific tool for similarity matching. It is not a cryptographic hash function and should not be relied on for authentication or tamper detection.
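To make the CRC32 concern concrete, the following sketch demonstrates a well-known structural property of CRCs: for three equal-length messages, the CRC of their XOR equals the XOR of their CRCs. An attacker can therefore predict how a change to the data changes the checksum without any secret. The class and method names here are illustrative only and do not appear in emissary.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class Crc32LinearityDemo {

    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // XOR three equal-length byte arrays element by element.
    static byte[] xor3(byte[] a, byte[] b, byte[] c) {
        if (a.length != b.length || a.length != c.length) {
            throw new IllegalArgumentException("inputs must have equal length");
        }
        byte[] out = new byte[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = (byte) (a[i] ^ b[i] ^ c[i]);
        }
        return out;
    }

    // True when the checksum of the combined message can be predicted
    // purely from the three individual checksums.
    static boolean affinePropertyHolds(byte[] a, byte[] b, byte[] c) {
        long predicted = crc32(a) ^ crc32(b) ^ crc32(c);
        return predicted == crc32(xor3(a, b, c));
    }

    public static void main(String[] args) {
        byte[] a = "record-0001".getBytes(StandardCharsets.US_ASCII);
        byte[] b = "record-0002".getBytes(StandardCharsets.US_ASCII);
        byte[] c = "record-0003".getBytes(StandardCharsets.US_ASCII);
        System.out.println("CRC32 predictable from parts: " + affinePropertyHolds(a, b, c));
    }
}
```

A cryptographic hash such as SHA-256 has no comparable algebraic shortcut, which is why CRC32 is acceptable for detecting accidental corruption but not deliberate tampering.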
There is no apparent mechanism to prevent developers from using these weaker algorithms in security-sensitive contexts. Users of emissary who rely on ChecksumCalculator for strong security guarantees (e.g., data integrity or authentication) may be misled into assuming these algorithms provide adequate protection.
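For callers who do need tamper detection, a minimal sketch of the alternative is to use SHA-256 directly from the JDK rather than going through a KFF-oriented helper. The class `IntegrityCheckSketch` is hypothetical and is not part of emissary; it assumes the expected digest was obtained over a trusted channel, and it does not provide authentication on its own (that would require a keyed construction such as an HMAC).

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class IntegrityCheckSketch {

    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    static String sha256Hex(byte[] data) {
        byte[] hash = sha256(data);
        StringBuilder sb = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Compares the digest of the data against a trusted expected digest.
    // MessageDigest.isEqual is used for a time-constant comparison.
    static boolean matches(byte[] data, byte[] expectedDigest) {
        return MessageDigest.isEqual(sha256(data), expectedDigest);
    }

    public static void main(String[] args) {
        byte[] original = {'a', 'b', 'c'};
        byte[] trusted = sha256(original);
        byte[] tampered = {'a', 'b', 'd'};
        System.out.println("SHA-256: " + sha256Hex(original));
        System.out.println("original matches: " + matches(original, trusted));
        System.out.println("tampered matches: " + matches(tampered, trusted));
    }
}
```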