Why This Matters
String problems are usually sequence problems with extra costs. The characters are easy to scan, but copying substrings, rebuilding strings repeatedly, or using the wrong search primitive can change the complexity.
Use this page before KMP string matching and Rolling hash.
Core Idea
Treat strings as immutable sequences. Scan with indices when performance matters, and build output with lists plus join when repeated concatenation would copy too much.
| Need | Python move | Watch for |
|---|---|---|
| Tokenize text | split | separators and empty fields |
| Build output | join | join strings, not arbitrary objects |
| Find substring | find | returns -1 when absent |
| Prefix check | startswith | clearer than slicing for intent |
| Suffix check | endswith | avoids manual index math |
| Character code | ord | maps char to integer |
| Integer to char | chr | maps code point to char |
| Substring | slicing | creates a new string |
Non-Example or Failure Mode
Repeatedly doing answer += ch inside a long loop can allocate many intermediate strings. Prefer parts.append(ch) and "".join(parts) when building a result incrementally.
Worked Example
Normalize a comma-separated tag string:
raw = " graph, heap , search "
tags = [part.strip() for part in raw.split(",")]
result = "-".join(tags)
The useful point is not the syntax. It is the model: split into parts, clean each part, then join once.
Common Mistakes
| Mistake | Correction |
|---|---|
| Assuming string slicing is free. | Slices allocate new strings. Use indices in hot loops. |
| Rebuilding strings one character at a time. | Collect parts in a list and call join once. |
Forgetting find returns -1. | Check index != -1 before using it. |
| Using slicing for every prefix check. | startswith makes the intent clear. |
| Confusing characters with bytes. | Use encode and decode when crossing the byte boundary. |
Diagnostic Questions
| Question type | Question | Answer signal |
|---|---|---|
| Definition | Are Python strings mutable? | No. Operations create new strings. |
| Example / non-example | Is text[:i] free? | No. It creates a substring copy. |
| Computation | What does "abc".find("z") return? | -1. |
| Transfer | Why use ord in rolling hash? | It converts characters to integer values for arithmetic. |
Runnable Drill
Python string pattern drill
Checks split, join, find, prefix checks, ord, chr, slicing, and a small frequency window.
Output will appear here.
Exercises
Beginner:
- Use
splitandjointo normalize a list of comma-separated words. - Explain the difference between
findandindex. - Convert characters to integer codes with
ord.
Intermediate:
- Detect whether a string has any repeated character.
- Replace repeated slicing in a palindrome check with two indices.
Challenge:
- Implement fixed-window anagram search with
Counter, then explain the update invariant.
Diagram Recommendation
Type: sliding-window character count.
Caption: "A fixed-width window moves across a string while the frequency table is updated by one outgoing and one incoming character."
Purpose: Connect string methods to the same invariants used in array sliding-window problems.
Next Topics
References
- Python documentation. String methods: https://docs.python.org/3/library/stdtypes.html#string-methods
- Python documentation. Sequence operations and slicing: https://docs.python.org/3/library/stdtypes.html#sequence-types-list-tuple-range
- Python documentation. Built-in functions
ordandchr: https://docs.python.org/3/library/functions.html