160 Characters

The Smuggling Route

Separate Channels: SS7 protocol sends voice (user data) and control-talk on different lines
Telephone networks have two components to transfer information:
User data: This carries the actual user data to be transferred, your voice on a phone call, for instance.
Control-talk: A set of instructions exchanged between your device and the network. Before your voice travels anywhere, the network needs to find your friend's phone, check if they're available, calculate the best route through potentially dozens of switches, set up billing, and reserve bandwidth.
In the 1980s, telephone networks were adopting a signaling protocol called SS7 (Signaling System No. 7). SS7 carries the voice (user data) and the control-talk on separate channels. Formally, this is called common channel signaling.
“SS7 sends data in the form of packets that have fixed sizes. This creates an interesting problem: what happens when your control message doesn't fill the container?”
The 140-Byte Void

140 Bytes of Unused Space: Found in every SS7 packet after accounting for headers
Friedhelm Hillebrand and Bernard Ghillebaert were examining SS7 packet structures. After accounting for all the headers like message type indicators, originating point codes, destination point codes, and circuit identification codes, they found something remarkable: 140 bytes of unused space in every packet.
“That is when an idea popped into their heads: what if we use this idle space to smuggle messages between devices?”
And that brings us to our 160-character limitation for SMS messages, or to be more technically precise, 140 bytes.

GSM-7 Encoding
Uses 7 bits per character. Supports basic Latin alphabet, numbers, and common symbols. Allows up to 160 characters in 140 bytes.
UCS-2 Encoding
Uses 16 bits per character. Required for emojis and non-Latin scripts. Reduces capacity to just 70 characters in the same 140 bytes.
Math Behind the Limit

GSM-7 Encoding: Uses 7 bits per character instead of 8
As we know, 1 byte = 8 bits. Thus, 140 bytes translate into 1120 bits:
140 bytes × 8 bits/byte = 1120 bits
GSM-7 is one of the encodings used for SMS text. It uses 7 bits for each character, so:
1120 bits ÷ 7 bits/character = 160 characters
“This same 160-character limit shrinks to a 70-character limit when we use emojis or characters from other scripts.”
This happens because non-GSM characters need a different encoding, such as UCS-2. UCS-2 uses 16 bits for each character, so:
1120 bits ÷ 16 bits/character = 70 characters
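The capacity arithmetic above can be checked in a few lines of Python. This is only a sketch: the names `PAYLOAD_BYTES` and `max_chars` are illustrative and do not come from any SMS library.

```python
PAYLOAD_BYTES = 140  # user-data space in a single SMS, as described above


def max_chars(bits_per_char: int, payload_bytes: int = PAYLOAD_BYTES) -> int:
    """Return how many whole characters fit into the payload."""
    total_bits = payload_bytes * 8
    return total_bits // bits_per_char


print(max_chars(7))   # GSM-7: 160
print(max_chars(16))  # UCS-2: 70
```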
The Encoding Wars
GSM-7 vs UCS-2: More characters vs more character types
GSM-7 uses seven bits to represent a character. 7 bits mean we can have 2^7 different combinations of 0s and 1s, i.e. 128 different characters.
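To see how 7-bit characters end up in 8-bit bytes, here is a minimal sketch of septet packing, where each 7-bit value is appended to a bit stream, least-significant bits first. The function name `pack_septets` is hypothetical. The example also assumes the input contains only unaccented ASCII letters, for which the GSM-7 code happens to equal the ASCII code.

```python
def pack_septets(septets: list[int]) -> bytes:
    """Pack 7-bit values into octets, least-significant bits first."""
    bits = 0
    for i, septet in enumerate(septets):
        # Place each 7-bit value directly after the previous one.
        bits |= (septet & 0x7F) << (7 * i)
    n_bytes = (7 * len(septets) + 7) // 8  # round up to whole bytes
    return bits.to_bytes(n_bytes, "little")


# Assumption: for a-z the GSM-7 code equals the ASCII code.
septets = [ord(c) for c in "hellohello"]
packed = pack_septets(septets)
print(packed.hex())  # e8329bfd4697d9ec37
print(len(packed))   # 9 bytes for 10 characters
```

Every eight characters save one byte, which is how 160 characters fit into 140 bytes.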
GSM-7 is very limited: it can't encode emojis or most non-European scripts. For those, SMS falls back to a second encoding called UCS-2, the fixed-width predecessor of UTF-16.
The '2' in UCS-2 stands for 2 bytes: it uses 2 bytes to store each character, so we now have 16 bits per character instead of 7.
“If a message contains even a single non-GSM character, the entire message is encoded with UCS-2.”
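The all-or-nothing switch described above can be sketched as a small encoding calculator. The alphabet tables follow the GSM 03.38 default alphabet and its extension table; the function names are hypothetical. Two details go beyond the text and are assumptions of this sketch: characters from the extension table (such as `€` or `[`) cost two septets each, and emojis are counted as UTF-16 surrogate pairs, which is how phones typically send them, so one emoji takes 4 bytes.

```python
# GSM 03.38 default alphabet; the string index equals the 7-bit code.
# Position 0x1B is the escape code that switches to the extension table.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = "\f^{}\\[~]|€"  # each needs escape + code = 2 septets


def choose_encoding(text: str) -> str:
    """Return 'GSM-7' if every character is representable, else 'UCS-2'."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        return "GSM-7"
    return "UCS-2"


def payload_bytes(text: str) -> int:
    """Bytes of SMS payload the text would occupy in the chosen encoding."""
    if choose_encoding(text) == "GSM-7":
        septets = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        return (septets * 7 + 7) // 8  # round up to whole bytes
    return len(text.encode("utf-16-be"))  # 2 bytes per 16-bit unit


for message in ["Hello World", "Hello 😊"]:
    print(message, choose_encoding(message), payload_bytes(message))
```

Adding a single emoji to a plain message flips the whole text to 16-bit units, which is why the character limit drops so sharply.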
Concatenation: A Hidden Transport Layer
Message Splitting: Long messages divided into 140-byte segments
Longer texts still show up as a single message because concatenation works behind the scenes. Messages longer than 160 characters are divided into 140-byte segments and sent over the network (SS7) one at a time. The receiver's device then joins the segments back together in the correct order.
The UDH (user data header) is used to facilitate proper concatenation. Each segment's header contains some metadata which informs the receiver of the correct order.
“The UDH needs 6 bytes, so each segment spends 6 of its 140 bytes on the header. That leaves 134 bytes for text.”
Since SMS pricing is based on segments, you pay for every extra segment a long message needs.
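The splitting and reassembly described above can be sketched as follows. The 6-byte header layout (header length, element ID 0x00, element length 0x03, reference number, total segments, sequence number) is the 8-bit-reference concatenation header from 3GPP TS 23.040. Everything else is an assumption of this sketch: the function names are hypothetical, the default reference number is arbitrary, and the text is assumed to contain only basic GSM-7 characters (one septet each).

```python
SEGMENT_BYTES = 140
UDH_BYTES = 6
SINGLE_LIMIT = 160
# 134 bytes = 1072 bits, which holds 153 whole 7-bit characters.
CHARS_PER_SEGMENT = ((SEGMENT_BYTES - UDH_BYTES) * 8) // 7


def build_udh(ref: int, total: int, seq: int) -> bytes:
    """Concatenation header: length, IE id, IE length, ref, total, seq."""
    return bytes([0x05, 0x00, 0x03, ref & 0xFF, total, seq])


def split_message(text: str, ref: int = 0x2A) -> list[tuple[bytes, str]]:
    """Split text into (udh, chunk) pairs; short texts need no header."""
    if len(text) <= SINGLE_LIMIT:
        return [(b"", text)]
    chunks = [
        text[i:i + CHARS_PER_SEGMENT]
        for i in range(0, len(text), CHARS_PER_SEGMENT)
    ]
    return [
        (build_udh(ref, len(chunks), seq), chunk)
        for seq, chunk in enumerate(chunks, start=1)
    ]


def reassemble(segments: list[tuple[bytes, str]]) -> str:
    """Order segments by the sequence number in byte 5 of the header."""
    ordered = sorted(segments, key=lambda s: s[0][5] if s[0] else 0)
    return "".join(chunk for _, chunk in ordered)


segments = split_message("a" * 300)
print(len(segments))       # 2
print(segments[0][0].hex())  # 0500032a0201
```

Because each segment carries its own sequence number, the receiver can rebuild the message even if segments arrive out of order.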

Single Message
Messages up to 160 characters fit in a single SMS packet using all 140 bytes.
Concatenation
Longer messages are split into segments. Each segment includes a 6-byte UDH (User Data Header) for reassembly, reducing capacity to 153 characters per segment.
The Cost
SMS pricing is per segment. At a rate of $0.0083 per segment, for example, a 300-character message costs 2× as much as a short one because it requires 2 segments to send. At scale, these costs add up quickly.
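A rough sketch of the segment and cost arithmetic, assuming GSM-7 text and the $0.0083 per-segment figure quoted above (actual rates vary by carrier and provider). The function names are illustrative.

```python
import math


def segment_count(n_chars: int, single_limit: int = 160,
                  per_segment: int = 153) -> int:
    """Segments needed for a GSM-7 message of n_chars characters."""
    if n_chars <= single_limit:
        return 1  # fits in one SMS, no UDH needed
    return math.ceil(n_chars / per_segment)


def message_cost(n_chars: int, price_per_segment: float = 0.0083) -> float:
    """Cost of one message, billed per segment (assumed rate)."""
    return segment_count(n_chars) * price_per_segment


for n in (160, 161, 300):
    print(n, segment_count(n), round(message_cost(n), 4))
```

Note the jump at 161 characters: a single extra character doubles the cost of the message.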
The Message Journey
Two-Stage Delivery: Lookup (finding recipient) → Delivery (sending message)
Stage 1: The Lookup
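The sender's phone encodes the text into an SMS-TPDU carrying up to 140 bytes of user data (using GSM-7 encoding here) and submits it to the SMSC (Short Message Service Center). The SMSC then uses the MAP protocol to send a query over the SS7 network to the recipient's HLR (Home Location Register), asking, "Where is phone number [Recipient's Number] right now?"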
“The HLR checks its database and sends a response packet back to the sender's SMSC over the SS7 network.”
Stage 2: The Delivery
Now that the SMSC has the target address, it sends the actual message. The sender's SMSC wraps the message in an SS7 packet and sends it to the recipient's current MSC (Mobile Switching Center) on the SS7 control plane. The MSC receives this packet, unwraps the envelopes, and delivers the message to the recipient's phone.
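The two stages can be modeled with a toy simulation. This is not how MAP or SS7 are implemented: the classes `HLR` and `MSC`, the function `send_sms`, and the phone numbers are all hypothetical stand-ins that only mirror the lookup-then-deliver flow described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HLR:
    """Toy Home Location Register: phone number -> name of serving MSC."""
    locations: dict = field(default_factory=dict)

    def lookup(self, number: str) -> Optional[str]:
        return self.locations.get(number)  # None if subscriber is unknown


@dataclass
class MSC:
    """Toy switching center that hands messages to attached phones."""
    name: str
    inbox: dict = field(default_factory=dict)

    def deliver(self, number: str, payload: bytes) -> None:
        self.inbox.setdefault(number, []).append(payload)


def send_sms(hlr: HLR, mscs: dict, recipient: str, payload: bytes) -> bool:
    """Play the SMSC's role: look up the recipient, then deliver."""
    if len(payload) > 140:
        raise ValueError("payload exceeds the 140-byte SMS limit")
    # Stage 1: the lookup. Ask the HLR where the recipient currently is.
    msc_name = hlr.lookup(recipient)
    if msc_name is None:
        return False
    # Stage 2: the delivery. Forward the message to the serving MSC.
    mscs[msc_name].deliver(recipient, payload)
    return True


hlr = HLR({"+15550100": "msc-east"})
mscs = {"msc-east": MSC("msc-east")}
print(send_sms(hlr, mscs, "+15550100", b"hello"))  # True
print(send_sms(hlr, mscs, "+15550199", b"hello"))  # False
```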
The Modern Shift: UTF-8 Over IP
From SMS to RCS/iMessage: Moving from control-plane to data-plane (IP)
RCS and iMessage run on a new stack that bypasses the old GSM stack entirely. It rests on two key revolutions:
1. The "Over IP" Revolution
The new way (RCS/iMessage) travels on the data plane (the 'IP' part, which stands for Internet Protocol). This is your 4G, 5G, or Wi-Fi connection. RCS turns messaging from a telephony service into an internet service.
2. The "UTF-8" Revolution
UTF-8 is a variable-length encoding. It stores A (a simple ASCII character) in 1 byte, é (a common European character) in 2 bytes, 汉 (a Chinese character) in 3 bytes, and 😊 (an emoji) in 4 bytes.
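The byte counts above are easy to verify with Python's built-in UTF-8 codec. The helper name `utf8_len` is illustrative.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


for ch in ["A", "é", "汉", "😊"]:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} {ch} -> {utf8_len(ch)} byte(s): {encoded.hex()}")
```

Unlike GSM-7 and UCS-2, a single unusual character does not force the whole message into a wider encoding: each character only pays for itself.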
“When you combine the 'superhighway' (IP) with the 'universal language' (UTF-8), you get all the features we now expect from a modern chat app.”
RCS and iMessage freed messaging from the ancient phone network. They abandoned the old approach on the SS7 control channel and turned messaging into what it is today: a modern, fast, and flexible internet application.
Sources
GSM Technical Specification 03.38
3GPP TS 23.040 - Technical realization of the Short Message Service (SMS)
ITU-T Recommendation Q.700 - Introduction to CCITT Signalling System No. 7
