Skip to main content

DSAPath18 minintermediate

Python Bytes and bytearray

Python bytes and bytearray are byte-level sequence tools for encodings, binary protocols, buffers, and integer packing.

DifficultyIntermediate
TierTier 2
ModulePython Practice
LanguagesPython
Review this page laterSign in to save DSA topics and keep review state.
Sign in to save

Why This Matters

Most DSA practice uses lists and strings, but real systems also move raw bytes. Hashing, parsers, file formats, network protocols, compression, and low-level encodings all need byte-level thinking.

Use this page when a problem stops being "characters in a string" and becomes "bytes in a buffer."

Core Idea

bytes is immutable. bytearray is mutable. Both are sequences of integers from 0 to 255.

ToolMutable?Main use
bytesnofixed binary data, encoded text, hash inputs
bytearrayyeseditable byte buffer
memoryviewviewslice buffers without copying in many cases
encodecreates bytestext to bytes
decodecreates strbytes to text
int.to_bytescreates bytespack integer into fixed byte width
int.from_bytescreates intread integer from bytes

Endianness

Endianness is byte order. Big-endian stores the most significant byte first. Little-endian stores the least significant byte first.

(258).to_bytes(2, "big") == bytes([0x01, 0x02])
(258).to_bytes(2, "little") == bytes([0x02, 0x01])

State the byte order whenever bytes represent a number.

Non-Example or Failure Mode

A Python str is text, not raw bytes. Indexing "A" gives a one-character string, while indexing b"A" gives integer 65. Confusing those two models causes parsing and encoding bugs.

Worked Example

You receive two bytes representing a length field in big-endian order:

packet = bytes([0x01, 0x2C])
length = int.from_bytes(packet, "big")

The result is 300 because 0x012C equals decimal 300.

Common Mistakes

MistakeCorrection
Treating text and bytes as the same thing.Use encode and decode at the boundary.
Forgetting that bytes[i] is an integer.Compare to 65 or use bytes([65]) for a one-byte object.
Mutating bytes.Use bytearray for editable buffers.
Omitting byte order.Pass "big" or "little" to integer packing methods.
Assuming slicing always stays zero-copy.Use memoryview when avoiding copies matters.

Diagnostic Questions

Question typeQuestionAnswer signal
DefinitionWhat values can a byte hold?Integers from 0 to 255.
Example / non-exampleIs b"abc" a Python str?No. It is bytes.
ComputationWhat is b"A"[0]?65.
TransferWhy does endianness matter?The same bytes can represent different numbers under different byte orders.

Runnable Drill

Python bytes and bytearray drill

Checks encoding, mutation, memoryview, hex, and integer byte order.

Output will appear here.

Exercises

Beginner:

  • Convert "heap" to UTF-8 bytes and back.
  • Explain why b"A"[0] is 65.
  • Mutate a bytearray in place.

Intermediate:

  • Pack and unpack 65535 using two bytes.
  • Compare big-endian and little-endian output for the same integer.

Challenge:

  • Write a parser for a toy packet format: two bytes of length followed by that many payload bytes.

Diagram Recommendation

Type: byte-order diagram.

Caption: "The integer 258 as two bytes: big-endian 01 02, little-endian 02 01."

Purpose: Make endianness concrete before learners parse binary data.

Next Topics

References