5.1 Data Security and Cryptography
What This Module Covers
Domain 5 is the daily practice of protecting data and systems. In this lesson, the focus is on how cryptography protects confidentiality, integrity, and authenticity; how data is classified and destroyed; and how logging feeds a SIEM so security teams can detect suspicious behavior.
Think of cryptography as a toolbox with different jobs:
- Encryption keeps data secret.
- Hashing proves data has not changed.
- Digital signatures prove who sent something and that it was not altered.
- PKI is the trust system that makes certificates usable at scale.
- Key management is the discipline that keeps the whole system from failing.
The exam often asks you to distinguish similar concepts. The trick is not to memorize isolated facts, but to understand which tool solves which problem.
1. Cryptography Fundamentals
Cryptography is the science of protecting information using mathematical algorithms and keys. A good cryptographic system usually provides one or more of the following:
- Confidentiality: only authorized parties can read the data.
- Integrity: the data has not been altered.
- Authentication: the sender or system is verified.
- Non-repudiation: a sender cannot credibly deny sending the data.
Plaintext, Ciphertext, Algorithm, and Key
| Term | Meaning |
|---|---|
| Plaintext | Readable original data |
| Ciphertext | Scrambled output after encryption |
| Algorithm | The mathematical method used |
| Key | The secret value that controls the algorithm |
Encryption is not the same as encoding. Encoding makes data representable in another format, like Base64. Encryption is meant to protect data.
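To make the difference concrete, here is a minimal sketch using Python's standard `base64` module. The sample value `secret` is made up for illustration. Notice that no key appears anywhere: anyone who sees the encoded text can decode it.

```python
import base64

# Hypothetical sample data, used only for this illustration.
secret = b"payroll: 4200"

# Encoding changes the representation, not the secrecy.
encoded = base64.b64encode(secret)

# Decoding needs no key, so encoding provides no confidentiality.
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == secret)  # True
```

If a question describes data that anyone can convert back without a secret, it is describing encoding, not encryption.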
Real-world analogy
Encryption is like putting a document into a locked safe. The algorithm is the safe design. The key is the combination. Without the combination, the document is still there, but it is useless to an attacker.
2. Symmetric Encryption
Symmetric encryption uses one shared secret key for both encryption and decryption. It is fast and efficient, so it is used for bulk data.
Common Symmetric Algorithms
| Algorithm | Status | Notes |
|---|---|---|
| AES | Current standard | Use AES-128, AES-192, or AES-256 |
| DES | Obsolete | 56-bit key is too small |
| 3DES | Deprecated | Legacy compatibility only |
| Blowfish | Older but still seen | Variable key size, historically popular |
| RC4 | Broken | Do not use |
AES
AES is the modern symmetric standard. The key sizes are 128, 192, and 256 bits. Larger keys generally mean more brute-force resistance, though implementation quality matters too.
Why AES matters on the exam:
- If a question asks for the current standard symmetric cipher, choose AES.
- If the question asks for bulk encryption, choose symmetric cryptography.
DES
DES used a 56-bit key. That key space is far too small by modern standards and can be brute-forced. In exam language, DES is obsolete.
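The arithmetic behind "too small" can be sketched in a few lines. The attacker speed `guesses_per_second` is an assumed figure chosen only for illustration, not a measurement of any real system.

```python
# Number of possible keys for each key length.
des_keys = 2 ** 56
aes128_keys = 2 ** 128

# Assumed attacker speed (hypothetical, for illustration only).
guesses_per_second = 10 ** 12

seconds_per_day = 60 * 60 * 24

# Worst-case time to try every DES key at the assumed rate.
des_days = des_keys / guesses_per_second / seconds_per_day

# How many times larger the AES-128 key space is than the DES key space.
aes_ratio = aes128_keys // des_keys

print(f"DES key space: {des_keys:,} keys")
print(f"Worst-case DES search at the assumed rate: {des_days:.2f} days")
print(f"AES-128 key space is 2**{aes_ratio.bit_length() - 1} times larger")
```

At the assumed rate, the whole DES key space falls in under a day, while each extra key bit doubles the work, which is why AES-128 is out of reach for the same attacker.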
3DES
3DES applies DES three times to extend its life. It was used as a bridge from DES to AES, but it is deprecated. If a test question presents 3DES as a modern recommendation, that is a red flag.
Blowfish
Blowfish is a symmetric block cipher with a variable key length. It still appears in legacy systems and some products, but it is not the first-choice modern answer when AES is available.
RC4
RC4 is a stream cipher with multiple known weaknesses and is considered broken. It was historically used in WEP and in older TLS configurations, but it should not be chosen for secure designs.
Symmetric strengths and limitations
| Strength | Limitation |
|---|---|
| Very fast | Key distribution is hard |
| Good for large files and data streams | Both parties need the same secret |
| Efficient on constrained systems | If the key leaks, secrecy is lost |
Example
An organization encrypts a backup archive before storing it in cloud storage. AES is the right family because the archive may be huge and needs fast, practical encryption.
3. Asymmetric Encryption
Asymmetric cryptography uses a key pair: a public key and a private key. What one key does, only the other can reverse or verify.
Asymmetric cryptography is slower than symmetric cryptography, so it is usually used for key exchange, identity, and signatures rather than for encrypting large files.
Common Asymmetric Algorithms
| Algorithm | Primary Use | Notes |
|---|---|---|
| RSA | Encryption and signatures | Common key sizes: 2048 and 4096 bits |
| ECC | Encryption, key agreement, signatures | Smaller keys than RSA |
| Diffie-Hellman | Key exchange only | Not used to encrypt data directly |
| DSA | Signatures only | Not an encryption algorithm |
RSA
RSA is widely used and often appears in certificates, signatures, and key exchange scenarios. For CC-level study, remember the common sizes 2048 and 4096 bits.
ECC
Elliptic Curve Cryptography provides similar security to RSA with smaller key sizes. That makes it efficient, especially on mobile devices and resource-constrained systems.
Diffie-Hellman
Diffie-Hellman is used for key exchange only. Two parties can derive a shared secret over an insecure channel without sending the secret itself. The algorithm helps establish a session key, but it does not directly encrypt the actual application data.
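The idea can be sketched with plain modular arithmetic. This is a toy illustration only: the prime `p`, the base `g`, and the variable names are assumptions made for the example, and real deployments use standardized groups or elliptic curves through a vetted library.

```python
import secrets

# Public parameters, visible to everyone (toy values for illustration).
p = 2 ** 127 - 1   # a known prime; not a recommended real-world group
g = 3

# Each party picks a private value in [2, p - 2] and never sends it.
a = secrets.randbelow(p - 3) + 2   # Alice's private value
b = secrets.randbelow(p - 3) + 2   # Bob's private value

# Only these public values cross the insecure channel.
A = pow(g, a, p)
B = pow(g, b, p)

# Each side combines its own private value with the other's public value.
alice_secret = pow(B, a, p)
bob_secret = pow(A, b, p)

print(alice_secret == bob_secret)  # True
```

Both sides arrive at the same number without ever transmitting it. In practice that shared value is fed into a key derivation step to produce the symmetric session key.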
DSA
DSA is used for digital signatures only. It is not meant to encrypt data. When the question asks about signatures and DSA is one of the choices, that is the right conceptual match.
Exam comparison table
| Need | Best Match |
|---|---|
| Fast bulk data encryption | AES |
| Key exchange over insecure network | Diffie-Hellman |
| Digital signatures | RSA or DSA |
| Smaller key sizes with similar security | ECC |
| Legacy encryption or signatures | RSA in many systems |
4. Hybrid Encryption and TLS
Most real secure systems use hybrid encryption. The reason is simple: symmetric crypto is fast, but asymmetric crypto solves the key distribution problem.
How Hybrid Encryption Works
1. The client and server establish trust and negotiate algorithms.
2. Asymmetric cryptography helps authenticate the server and securely exchange session material.
3. A symmetric session key is created.
4. The actual data is encrypted with the symmetric key.
This is why TLS is a hybrid system. It uses public-key cryptography during the handshake and symmetric encryption for the bulk of the session.
TLS handshake analogy
Think of it like meeting someone in public:
- Asymmetric crypto is the identity check and handoff of a sealed envelope.
- Symmetric crypto is the private conversation after both sides agreed on a secret code for the rest of the meeting.
Why hybrid design matters
If TLS used only asymmetric encryption for all traffic, it would be too slow. If it used only symmetric encryption without a secure way to share the key, attackers could intercept the key exchange. Hybrid encryption solves both problems.
5. Hashing
Hashing is a one-way function that turns input data into a fixed-length output called a digest.
Key properties of hashes
- One-way: you cannot reverse the hash to recover the original data.
- Fixed output: the output size is constant for a given algorithm.
- Avalanche effect: a tiny change in input creates a very different output.
- Integrity-focused: hashes help detect change, not hide data.
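A short sketch with Python's standard `hashlib` shows the fixed output size and the avalanche effect. The sample messages are made up for illustration.

```python
import hashlib

# Two inputs that differ by a single character.
d1 = hashlib.sha256(b"transfer $100").hexdigest()
d2 = hashlib.sha256(b"transfer $900").hexdigest()

# A much larger input still produces a digest of the same length.
long_digest = hashlib.sha256(b"x" * 1_000_000).hexdigest()

# Count how many of the 256 output bits differ between d1 and d2.
diff_bits = bin(int(d1, 16) ^ int(d2, 16)).count("1")

print(len(d1), len(d2), len(long_digest))  # 64 64 64 (hex characters = 256 bits)
print(d1 == d2)                            # False
print(f"{diff_bits} of 256 bits differ")
```

For a well-designed hash, roughly half of the output bits change even when the input changes by one character.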
Common Hash Algorithms
| Algorithm | Status | Notes |
|---|---|---|
| MD5 | Broken | Collision attacks make it unsuitable for security |
| SHA-1 | Deprecated | No longer trusted for security use |
| SHA-256 | Standard | Widely used and accepted |
| SHA-3 | Modern standard | Newer design family |
MD5 and SHA-1
MD5 and SHA-1 are often mentioned in legacy systems and old documents. In modern security work, they should not be used for protection where collision resistance matters.
SHA-256 and SHA-3
SHA-256 is the common exam answer for a secure modern hash. SHA-3 is also a standard and may appear as a newer alternative.
Hashing example
When a file is downloaded, the vendor may publish a SHA-256 hash. You compute the hash locally and compare it to the vendor's value. If they match, the file likely arrived unchanged.
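The check described above might look like the following sketch. The helper names `sha256_of_file` and `matches_published` are hypothetical, and a temporary file stands in for the real download.

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path, published_hex):
    """Compare the local digest with the vendor's published value."""
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())

# Demo: a temporary file stands in for the downloaded installer.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"installer bytes")
    demo_path = tmp.name

published = hashlib.sha256(b"installer bytes").hexdigest()  # stand-in for the vendor's value
ok = matches_published(demo_path, published)
tampered = matches_published(demo_path, "0" * 64)
os.remove(demo_path)

print(ok, tampered)  # True False
```

The comparison is only as trustworthy as the channel the published hash came from. If an attacker can replace both the file and the hash, a match proves nothing.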
Important distinction
Hashing is not encryption.
| Feature | Encryption | Hashing |
|---|---|---|
| Reversible | Yes, with the key | No |
| Purpose | Confidentiality | Integrity |
| Output | Variable, depends on plaintext size | Fixed length |
| Uses key | Yes | No (for a plain hash) |
6. Digital Signatures
A digital signature proves integrity, authenticity, and non-repudiation. It is created with a hash and the sender's private key.
Step-by-step process
1. The sender computes a hash of the message.
2. The sender signs the hash with their private key (exam material often describes this as encrypting the hash). This creates the signature.
3. The message and signature are sent to the receiver.
4. The receiver computes their own hash of the received message.
5. The receiver uses the sender's public key to recover the hash from the signature.
6. If the two hashes match, the message is intact and the sender is verified.
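The steps above can be sketched with textbook RSA arithmetic. This is a toy illustration under stated assumptions: the key is built from two small, publicly known primes, there is no padding, and the function names are hypothetical. Real signatures use vetted libraries, large keys, and a padding scheme.

```python
import hashlib

# Toy key pair built from two known Mersenne primes (illustration only).
p = 2 ** 61 - 1
q = 2 ** 89 - 1
n = p * q                              # public modulus
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (needs Python 3.8+)

def digest_as_int(message: bytes) -> int:
    # Reduced mod n only because this toy modulus is smaller than a SHA-256 digest.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message: bytes) -> int:
    # Steps 1-2: hash the message, then apply the private key to the hash.
    return pow(digest_as_int(message), d, n)

def verify(message: bytes, signature: int) -> bool:
    # Steps 4-6: hash the received message and compare with the recovered hash.
    return pow(signature, e, n) == digest_as_int(message)

msg = b"Pay vendor 500"
sig = sign(msg)

print(verify(msg, sig))                # True
print(verify(b"Pay vendor 900", sig))  # False
```

Only the holder of `d` can produce a signature that verifies, while anyone with the public values `n` and `e` can check it. The message itself is never hidden.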
What signatures do and do not do
| Provides | Does Not Provide |
|---|---|
| Integrity | Confidentiality |
| Authentication | Hiding the message content |
| Non-repudiation | Performance on large data |
Example
A software vendor signs an installer. The signature tells you the file really came from that vendor and was not changed after signing.
7. PKI and Certificates
Public Key Infrastructure is the system of policies, technologies, people, and processes that manage certificates and trust.
Main PKI components
| Component | Function |
|---|---|
| CA | Certificate Authority. Issues and signs certificates |
| RA | Registration Authority. Verifies identity before issuance |
| Certificate | Binds an identity to a public key |
| CRL | Certificate Revocation List. Lists revoked certificates |
| OCSP | Online Certificate Status Protocol. Checks revocation status in real time |
Certificate contents
A certificate normally contains:
- Subject name
- Subject public key
- Issuer name
- Validity period
- Serial number
- Digital signature from the CA
Certificate lifecycle
1. Request
2. Identity validation
3. Issuance
4. Use
5. Renewal
6. Revocation or expiration
CRL vs OCSP
| Method | Description | Advantage | Limitation |
|---|---|---|---|
| CRL | List of revoked certs | Simple and familiar | Can be stale |
| OCSP | Online status check | More current | Depends on responder availability |
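As a rough sketch of how validity periods and a CRL fit together, the example below models a certificate as a simple record and checks it against a set of revoked serial numbers. `CertificateRecord` and `is_acceptable` are hypothetical names, the record omits the public key and CA signature, and no signature verification is performed, so this is not a substitute for real certificate validation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class CertificateRecord:
    """Simplified, hypothetical stand-in for a certificate's key fields."""
    subject: str
    issuer: str
    serial_number: int
    not_before: datetime
    not_after: datetime

def is_acceptable(cert, revoked_serials, now):
    """True if `now` falls in the validity period and the serial is not revoked."""
    in_validity = cert.not_before <= now <= cert.not_after
    return in_validity and cert.serial_number not in revoked_serials

cert = CertificateRecord(
    subject="www.example.com",
    issuer="Example CA",
    serial_number=1001,
    not_before=datetime(2024, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
crl = {2002, 3003}  # serial numbers the CA has revoked (made-up values)
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
later = datetime(2025, 6, 1, tzinfo=timezone.utc)

print(is_acceptable(cert, crl, now))           # True
print(is_acceptable(cert, crl | {1001}, now))  # False: revoked
print(is_acceptable(cert, crl, later))         # False: expired
```

A CRL check like this is only as fresh as the last list download, which is the staleness limitation noted in the table. OCSP replaces the local set lookup with a query to the CA's responder.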
Real-world analogy
PKI is like a passport system. The CA is the government office that issues the passport, the certificate is the passport, and revocation is like canceling a passport if it is compromised.
8. Key Management
Cryptography fails if key management is weak. The strongest algorithm cannot save a key that was emailed around, stored in plain text, or never rotated.
Key management lifecycle
| Phase | Good Practice |
|---|---|
| Generate | Use strong randomness and approved algorithms |
| Distribute | Use secure channels and access controls |
| Store | Protect in HSMs, key vaults, or secure modules |
| Rotate | Replace keys on schedule or after events |
| Revoke | Disable compromised or expired keys |
| Destroy | Remove keys securely when no longer needed |
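The Generate and Rotate phases can be sketched with Python's standard `secrets` module. The 90-day limit `MAX_KEY_AGE` and the helper names are assumptions for illustration, and holding key material in a Python dictionary is for demonstration only; real keys belong in an HSM or key vault.

```python
import secrets
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)  # assumed policy value, not a standard

def generate_key():
    """Create a 256-bit key from a cryptographically secure random source."""
    return {
        "material": secrets.token_bytes(32),
        "created": datetime.now(timezone.utc),
    }

def needs_rotation(key, now=None, compromised=False):
    """Rotate on schedule (age) or after an event (suspected compromise)."""
    now = now or datetime.now(timezone.utc)
    return compromised or (now - key["created"]) > MAX_KEY_AGE

key = generate_key()

print(len(key["material"]) * 8)                                       # 256
print(needs_rotation(key))                                            # False
print(needs_rotation(key, now=key["created"] + timedelta(days=91)))   # True
print(needs_rotation(key, compromised=True))                          # True
```

The important detail is the random source: `secrets` draws from the operating system's secure generator, whereas a general-purpose generator such as `random` is not suitable for keys.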
Common controls
- Hardware Security Module (HSM)
- Cloud key vaults
- Separation of duties
- Dual control for sensitive operations
- Audit logging for key use
Example
If an encryption key protecting payroll data is exposed, the organization should rotate that key and consider re-encrypting sensitive data. The risk is not the algorithm alone. It is whether the secret remained secret.
9. Data Classification
Classification helps an organization decide how to handle data based on sensitivity and business impact.
Government classification
| Level | Meaning |
|---|---|
| Unclassified | No special protection required |
| Confidential | Could cause damage if disclosed |
| Secret | Could cause serious damage if disclosed |
| Top Secret | Could cause exceptionally grave damage if disclosed |
Commercial classification
| Level | Meaning |
|---|---|
| Public | Intended for anyone |
| Internal | For employees or internal use |
| Confidential | Sensitive business data |
| Restricted | Highest sensitivity in many corporate schemes |
Labeling
Classification only helps if users can see it. Labels may appear in:
- Document headers and footers
- Watermarks
- Email banners
- Metadata
- File properties
10. Data States and Lifecycle
Data has three major states:
| State | Meaning | Common Protection |
|---|---|---|
| At rest | Stored on disk, in backup, in database | Encryption, access control |
| In transit | Moving across a network | TLS, VPN, IPsec |
| In use | Being processed in memory | Hardening, memory protections, access control |
The data lifecycle is not just creation and deletion. It includes collection, storage, use, sharing, archival, and final destruction.
Lifecycle view
1. Create or collect
2. Classify and label
3. Store securely
4. Process or use
5. Share or transmit if authorized
6. Archive or retain per policy
7. Destroy when no longer needed
11. Data Destruction
Destroying data means making recovery impractical or impossible.
Methods of destruction
| Method | Best Use | Notes |
|---|---|---|
| Degaussing | Magnetic media | Not for SSDs |
| Overwriting | Some disks and media | May be unreliable on modern SSDs |
| Shredding | Physical destruction | Very effective |
| Crypto-shredding | Encrypted data | Destroy the key, data becomes unreadable |
| Incineration | Physical media | Highly destructive |
Important exam rule
Degaussing does not work on SSDs because SSDs are not magnetic media. This is a frequent exam trap.
Choosing the right destruction method
- For old hard drives: overwriting, shredding, or degaussing can be appropriate.
- For encrypted cloud data: crypto-shredding is often the fastest and cleanest option.
- For paper records: shredding or incineration.
12. Logging and SIEM
Logging is essential because you cannot defend what you cannot see. Security logs provide evidence, support investigations, and reveal patterns of abuse.
What should be logged
- Authentication events
- Authorization failures
- File and object access
- Administrative changes
- Security alerts
- System and application errors
SIEM
SIEM stands for Security Information and Event Management. It centralizes log data, normalizes it, correlates events, and supports alerting and reporting.
SIEM workflow
1. Collect logs from many systems.
2. Normalize different formats.
3. Correlate events to find patterns.
4. Alert on suspicious activity.
5. Retain logs for forensics and compliance.
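A heavily simplified sketch of the normalize, correlate, and alert steps is shown below. The two log formats, the field names, the threshold of three failures, and the five-minute window are all made up for illustration; a real SIEM handles far more sources and rules.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Collected events in two hypothetical source formats.
raw_events = [
    {"source": "vpn", "line": "2024-05-01T10:00:00 FAIL user=alice ip=203.0.113.7"},
    {"source": "vpn", "line": "2024-05-01T10:00:10 FAIL user=alice ip=203.0.113.7"},
    {"source": "web", "ts": "2024-05-01 10:00:20", "result": "denied",
     "username": "alice", "client": "203.0.113.7"},
    {"source": "web", "ts": "2024-05-01 10:03:00", "result": "denied",
     "username": "bob", "client": "198.51.100.4"},
    {"source": "vpn", "line": "2024-05-01T10:04:00 OK user=alice ip=203.0.113.7"},
]

def normalize(event):
    """Map each source format onto one common schema."""
    if event["source"] == "vpn":
        ts, outcome, user_kv, ip_kv = event["line"].split()
        return {"time": datetime.fromisoformat(ts),
                "user": user_kv.split("=", 1)[1],
                "ip": ip_kv.split("=", 1)[1],
                "success": outcome == "OK"}
    return {"time": datetime.fromisoformat(event["ts"]),
            "user": event["username"],
            "ip": event["client"],
            "success": event["result"] == "allowed"}

def correlate(events, threshold=3, window=timedelta(minutes=5)):
    """Alert when one user/IP pair reaches `threshold` failures within `window`."""
    failures = defaultdict(list)
    alerts = []
    for ev in sorted(events, key=lambda e: e["time"]):
        if ev["success"]:
            continue
        key = (ev["user"], ev["ip"])
        recent = [t for t in failures[key] if ev["time"] - t <= window]
        recent.append(ev["time"])
        failures[key] = recent
        if len(recent) == threshold:
            alerts.append({"user": ev["user"], "ip": ev["ip"], "failures": len(recent)})
    return alerts

alerts = correlate([normalize(e) for e in raw_events])
print(alerts)  # [{'user': 'alice', 'ip': '203.0.113.7', 'failures': 3}]
```

Neither source alone shows three failures for alice. The pattern only appears after both formats are normalized into one schema and correlated, which is the core value a SIEM adds.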
Why logs matter
Logs can show:
- A failed login brute-force attempt
- A privileged account used at an unusual time
- A certificate issued and then revoked
- A file deleted before a suspicious exfiltration event
Log protection
Logs themselves are evidence, so they must be protected from tampering. Common controls include access restrictions, write-once storage, hashing, and central collection.
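One way hashing can make tampering evident is to chain each log entry to the hash of the previous one. The sketch below is an illustrative assumption about how such a chain could be built, with hypothetical function names and log entries, not a description of any particular product.

```python
import hashlib

GENESIS = "0" * 64  # starting value for the first link (arbitrary choice)

def chain(entries):
    """Attach to each entry a hash covering it and the previous link."""
    prev = GENESIS
    chained = []
    for entry in entries:
        link = hashlib.sha256((prev + entry).encode("utf-8")).hexdigest()
        chained.append((entry, link))
        prev = link
    return chained

def verify_chain(chained):
    """Recompute every link; any edited entry breaks the chain from that point."""
    prev = GENESIS
    for entry, link in chained:
        expected = hashlib.sha256((prev + entry).encode("utf-8")).hexdigest()
        if expected != link:
            return False
        prev = link
    return True

log = chain(["admin login", "user added to Domain Admins", "admin logout"])
print(verify_chain(log))       # True

# An attacker rewrites one entry but cannot fix the stored hash to match.
tampered = list(log)
tampered[1] = ("nothing happened", log[1][1])
print(verify_chain(tampered))  # False
```

An attacker who can rewrite the entire chain could recompute every link, which is why hashing is paired with write-once storage and central collection rather than used alone.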
Exam Tips
- AES is the modern symmetric standard.
- RSA is common for encryption and signatures.
- Diffie-Hellman is for key exchange only.
- DSA is for signatures only.
- Hashes are one-way and fixed length.
- Digital signatures use a hash plus the sender's private key.
- Degaussing does not work on SSDs.
- SIEM centralizes and correlates logs.
Practice Questions
1. Which algorithm is the current symmetric encryption standard? ✅ AES
2. Which algorithm is considered broken and should not be used for security? ✅ MD5
3. What is the primary purpose of Diffie-Hellman? ✅ Key exchange only
4. Which asymmetric algorithm is used for digital signatures only? ✅ DSA
5. What property means a small input change creates a very different hash output? ✅ Avalanche effect
6. What does a digital signature provide besides integrity? ✅ Authentication and non-repudiation
7. What PKI component checks certificate revocation in real time? ✅ OCSP
8. Which destruction method does not work on SSDs? ✅ Degaussing
9. What data state applies when information is being processed in memory? ✅ In use
10. What system centralizes logs and correlates events? ✅ SIEM