Bengt Mårtensson > Private Site
 

On the "Kilobyte" and computerists' obsession with powers of 2.

The misuse of a century-old prefix

In the 1970s, semiconductor computer memory left the research laboratories. Since it is addressed by a finite number of address lines, each carrying a binary signal, the number of cells in a computer memory chip is almost always a power of two. Thus, there was a need for a short way of saying "this chip contains 1024 memory cells". Since 1024 = 2^10 is close to 1000, the convention of calling 1024 bytes "1 kByte" was born. Everyone knew that it was, strictly speaking, wrong, but the community understood it unambiguously. When computer memory, both semiconductor memory and non-volatile memory, grew in capacity, words like "Megabyte" and even "Gigabyte" were born and, according to the convention, "given" the values 2^20 = 1048576 and 2^30 = 1073741824, respectively.
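The numbers above are easy to check. The following is a minimal Python sketch; the names `cells` and `PREFIXES` are made up for this illustration and come from no library. It counts the cells that a given number of address lines can select, and shows how far each "binary" value drifts from the SI value whose name it borrows.

```python
# Illustrative sketch: n binary address lines can select 2**n distinct cells.
def cells(address_lines: int) -> int:
    return 2 ** address_lines


# (SI name, "binary" value, true SI value) for the prefixes mentioned above.
PREFIXES = [
    ("kilo", cells(10), 10 ** 3),
    ("mega", cells(20), 10 ** 6),
    ("giga", cells(30), 10 ** 9),
]

for name, binary, si in PREFIXES:
    excess = (binary / si - 1) * 100  # percent by which the binary value exceeds SI
    print(f"{name}: {binary} vs {si} (+{excess:.1f}%)")
# kilo: 1024 vs 1000 (+2.4%)
# mega: 1048576 vs 1000000 (+4.9%)
# giga: 1073741824 vs 1000000000 (+7.4%)
```

The discrepancy is small for "kilo" but grows with every step up the prefix ladder.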

This story has been told many times; Googling gives more than enough hits.

Prefixes like "kilo" have been used in science and technology for centuries, and cannot just be "redefined" because someone needs another prefix, any more than pi can be redefined. Since the 1960s they have been known as SI prefixes. To provide at least some intellectual consistency, esoteric rules were invented, like: "It is not 'kilo-Bytes', it is 'KBytes' (pronounced 'kay-bytes')", writing the "K" in upper case, etc.

Not all computer memory is semiconductor memory addressed with a number of binary signals. Hard disk manufacturers (who did not address their drives like semiconductor memory) sold their products with capacity given in SI Megabytes or Gigabytes, to the extreme rage of the "1 KByte = 1024 Bytes" faction, who accused the disk manufacturers of giving "misleading information" by not using the faction's own misused and non-standardized "prefixes".

Psychologically, in such cases, the one claiming that the slightly smaller number is the correct one tends, a priori, to make the more serious impression. The one saying "100 Watt RMS" must be more serious than the one saying "150 Watt music effect", just as "76 kW" sounds more serious than "100 horsepower". Or the one saying that the hard disk contains 100 GB instead of 107 GB... (See this footnote for an almost conspiracy-theory version.)

The "binary prefixes"

So, there was a need for prefixes of the form 2^(10n) that leave the classical SI prefixes alone. In December 1998 the International Electrotechnical Commission (IEC) defined a number of such prefixes in the standard IEC 60027-2; see NIST or Wikipedia. The following prefixes were defined:

Factor  Name  Symbol
2^10    kibi  Ki
2^20    mebi  Mi
2^30    gibi  Gi
2^40    tebi  Ti
2^50    pebi  Pi
2^60    exbi  Ei

Thus, using SI names for binary multiples is, at least now, not only misleading but also unnecessary, and should be discouraged; see NIST.
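To make the two prefix families concrete, here is a small Python sketch that prints the same byte count with either SI or IEC prefixes. The function `format_bytes` and its parameters are hypothetical names chosen for this illustration, not part of any standard library.

```python
# Prefix symbols up to 10^18 (SI) and 2^60 (IEC), in ascending order.
SI_PREFIXES = ["", "k", "M", "G", "T", "P", "E"]
IEC_PREFIXES = ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei"]


def format_bytes(n: int, binary: bool = False, digits: int = 3) -> str:
    """Format a byte count using SI (base 1000) or IEC (base 1024) prefixes."""
    base = 1024 if binary else 1000
    prefixes = IEC_PREFIXES if binary else SI_PREFIXES
    value = float(n)
    index = 0
    # Step up one prefix at a time until the value drops below the base.
    while value >= base and index < len(prefixes) - 1:
        value /= base
        index += 1
    return f"{value:.{digits}f} {prefixes[index]}B"


print(format_bytes(4_700_000_000))               # 4.700 GB
print(format_bytes(4_700_000_000, binary=True))  # 4.377 GiB
print(format_bytes(100 * 2 ** 30))               # 107.374 GB
```

The last line reproduces the hard disk example: 100 GiB and about 107 GB are the same number of bytes, named in two different systems.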

Binary is for geeks

Even if we settle on a sensible, common and universally accepted language that calls different things by different names, the binary prefixes are often not a natural, or even good, choice. There is no doubt that it is practical and natural to speak of a "1 GiB" computer memory or a 32 KiHz clock frequency. However, an advantage of "A one-layer DVD contains 4.377 GiB" over "A one-layer DVD contains 4.7 GB" is not obvious or, rather, not there. But who cares: both these numbers are equally ugly, so why should one ugly number be preferred over the other? However, when we are to do arithmetic, the difference shows:

Problem 1:
Will a 3.83 GB file plus an 870 MB file fit on a 4.7 GB DVD?
Problem 2:
Will a 3.567 GiB file plus an 829.7 MiB file fit on a 4.377 GiB DVD?

This is the same problem, first in base 10 and then in base 2. (There are more decimals in the second case, but that is not the point.) In both cases, the files will fit exactly. In both cases, the formulation is mathematically unambiguous, and the problem uniquely solvable. However, in the first case, the answer requires a simple addition, which many people can do in their heads (even I can :-). The second case is not a hard task as a math assignment, but it feels silly to compute 3.567 + 829.7/1024 to find out if you need one or two DVDs. Just adding 3.567 and 0.8297 simply gives the wrong answer! The "1.44 MB" floppy disk (whose capacity is neither 1.44 MB nor 1.44 MiB, but 1.44*1000*1024 bytes = 1.47 MB = 1.41 MiB, exactly double that of the "double-density" 720 KiB floppy disk!) is probably a victim of the comparative difficulty of doing arithmetic in the base-2 world.
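The arithmetic in the two problems, and the floppy disk figures, can be spelled out as follows. This is only a sketch; all variable names are invented for the illustration.

```python
MB, GB = 10 ** 6, 10 ** 9
KiB, MiB, GiB = 2 ** 10, 2 ** 20, 2 ** 30

# Problem 1: decimal units, exact integer arithmetic in bytes.
file_a = 3_830 * MB        # 3.83 GB
file_b = 870 * MB          # 870 MB
dvd = 4_700 * MB           # 4.7 GB
print(file_a + file_b <= dvd)  # True: 3.83 + 0.87 = 4.70

# Problem 2: the same sizes in binary units, using the rounded figures from the text.
correct_gib = 3.567 + 829.7 / 1024  # MiB must be divided by 1024, not 1000
naive_gib = 3.567 + 0.8297          # the tempting "move the decimal point" shortcut
print(f"{correct_gib:.3f} GiB")     # 4.377 GiB, the DVD is exactly full
print(f"{naive_gib:.3f} GiB")       # 4.397 GiB, wrongly suggests it does not fit

# The "1.44 MB" floppy: 1440 blocks of 1024 bytes, twice the 720 KiB disk.
floppy = 1440 * KiB
print(floppy)                       # 1474560
print(f"{floppy / MB:.2f} MB")      # 1.47 MB
print(f"{floppy / MiB:.2f} MiB")    # 1.41 MiB
print(floppy == 2 * 720 * KiB)      # True
```

Working in bytes with integers sidesteps the unit mix-up entirely; the mistake only appears when one tries to shift the decimal point between MiB and GiB.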

But of course, base 2 is to computer science what Latin is to medicine...

Other external links

  • The Wikipedia article on Kilobyte previously took the same standpoint as this article, which is probably not correct for an encyclopedia article. Presently, it says "Kilobyte ... equal to either 1,000 bytes (10^3) or 1,024 bytes (2^10), depending on context.", probably as some compromise. This is bad; it is simply not a statement on which a consensus exists; see the next entry, in particular its Footnote 1.
  • This article, written by a 1KB=1024B proponent, illustrates quite well the mess you get into with the 1024B interpretation. Of course, this was not the author's intention. Footnote [7] is entertaining ("Nearly everyone (including the experts) gets this wrong..."). Interestingly enough, the author suggests using "BB" for "billion bytes" instead of "GB", not taking into account that "billion" means different things in different regions of the world...
  • A plea for sanity. Makes nearly the same argument as this article, and is quite funny.