## Ask Professor Puzzler


*"Sir , I saw your post about how to convert from alphabet to binary , in which I agree about the post. But there was a question I saw I'm my textbook which state that convert the hexadecimal number 4B3.3 to decimal. In your post here https://www.theproblemsite.com/ask/2016/01/converting-letters-to-binary, I notice that A= 65 and B =66. then I equate 66 to represent B. but after solving it there answer is different from mine .they got 1203.1875, and I got 18019.1875. How come sir? But I notice they equate there B = 11. Sir, I need your highlight on it. Thanks for talking time to read it, waiting for ur reply."*

Hi Abolade, Thanks for asking this question. To understand the answer to that, I need to talk for just a moment about graphemes. A grapheme, according to one dictionary, is "The smallest meaningful contrastive unit in a writing system." That's a fancy way of saying that a grapheme is a symbol that represents something meaningful in a writing system. For example, "7" is a grapheme for the number seven. It's a symbol, and whenever you see it, you automatically know that it represents this many things:

X X X X X X X.

On the other hand, "72" is not a grapheme because it's not the *smallest* graphical unit. "72" is actually two graphemes: the grapheme for the number seven, and the grapheme for the number two.

Graphemes are also used to represent letters. "A" is the grapheme we use to represent the first vowel in the alphabet, "B" is the grapheme we use to represent the first consonant, and so forth. Of course, the graphemes for these letters might look different if you were writing a different language, such as Greek: α β γ δ ε...

So here is where things get confusing. In base ten (our normal counting system), we have ten digits. And therefore, we have ten graphemes: 0 1 2 3 4 5 6 7 8 9. Perfect! We have just enough numerical graphemes! And if we're in base eight (also known as "octal"), we have eight graphemes: 0 1 2 3 4 5 6 7. We have more than enough graphemes (the graphemes for eight and nine don't get used in this base).

In fact, for any base less than ten, we have more than enough graphemes. The problem is when we start talking about bases *greater* than ten. Then we don't have enough graphemes!

So instead of inventing *new* graphemes to represent these digits, mathematicians said, "Why waste the effort developing new symbols, when we've got all these other graphemes lying around not being used?" Specifically, we're talking about the alphabet graphemes. So if you are in base eleven, your graphemes are: 0 1 2 3 4 5 6 7 8 9 A, where A is the grapheme for the number ten.

Similarly, if you are in base sixteen (also known as "hexadecimal"), you have sixteen graphemes, and six of them are stolen from the alphabet: 0 1 2 3 4 5 6 7 8 9 A B C D E F.

It's very important to note that in this context, A, B, C, D, E, and F, even though they look like the graphemes for LETTERS, have a very different meaning; they are now graphemes for NUMBERS!

A is the grapheme for the number ten

B is the grapheme for the number eleven

C is the grapheme for the number twelve

D is the grapheme for the number thirteen

E is the grapheme for the number fourteen

F is the grapheme for the number fifteen
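If you'd like to see those digit values for yourself, here is a quick sketch in Python (my choice of language for illustration). It leans on Python's built-in `int(text, 16)`, which reads a string as a hexadecimal number; the name `hex_digit_values` is just something made up for this example.

```python
# Map each of the sixteen hexadecimal graphemes to the number it stands for.
# hex_digit_values is a made-up name for this illustration.
hex_digit_values = {digit: int(digit, 16) for digit in "0123456789ABCDEF"}

for digit in "ABCDEF":
    print(digit, "is the grapheme for the number", hex_digit_values[digit])
```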

The mathematicians very selfishly pirated letters from the alphabet to take on a completely different meaning. I mean, why not? It's not like there would be any confusion, right? Shakespeare was never going to write a sonnet about the hexadecimal number system, so there's no chance there would ever be an overlap of meaning where we were unsure whether A represents a number or a letter, right?

Wrong! Welcome to the world of computers! In a computer system, even though you can *type* graphemes and a computer can *display* graphemes, the computer does not *understand* graphemes. Computers "think" strictly in terms of numbers, not graphemes. Which means that people who designed computers had to come up with a system of converting graphemes into numbers, so the computer would be able to handle them.

How many graphemes are there? Well, don't forget that not only are there number graphemes and letter graphemes, there are also punctuation graphemes, mathematical operation graphemes, and special symbols like the pipe character, the tilde, etc.

So somewhere along the way, someone (or a committee of someones) had to develop a conversion chart so that whenever someone typed a character on the keyboard, there was a *number* that corresponded to whatever they typed. For example:

33 is the number that represents the exclamation mark grapheme

34 is the number that represents the quotation mark grapheme

43 is the number that represents the addition grapheme

65 is the number that represents the upper case "A" grapheme

66 is the number that represents the upper case "B" grapheme

97 is the number that represents the lower case "a" grapheme

Uh oh...now we have a problem! The grapheme "A" is a representation for the number 10, but 65 is the *computer's* value for the "A" grapheme. And this is the source of the confusion. Consider the following:

BAD = a hexadecimal number

BAD = of poor quality or low standard

Both of these statements are true. BAD is a hexadecimal *number*, but it's also a *word!* In one case, B, A, and D are graphemes for numbers, and in the other case, they are graphemes for letters! If you wanted to convert BAD (the number) into base ten, you would do this: 11 x 16^{2} + 10 x 16 + 13 = 2989. But if you wanted to convert BAD (the text) into numbers, so the computer could deal with it, you would do this: B = 66, A = 65, D = 68.

So if you're dealing with numbers (as in your textbook), A = 10, B = 11, etc. If you're dealing with text (words), A = 65, B = 66, etc.
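Here is a small sketch of the textbook problem, assuming we treat every symbol as a hexadecimal digit (B = 11) and give each place to the right of the point a value of 1/16, 1/256, and so on. The function name `hex_to_decimal` is hypothetical, written just for this example; `int(digit, 16)` and `ord` are Python built-ins.

```python
def hex_to_decimal(text):
    """Convert a hexadecimal string such as "4B3.3" to a base-ten number.

    hex_to_decimal is a hypothetical helper written for this example.
    """
    whole, _, fraction = text.partition(".")
    value = 0
    # Whole part: multiplying by 16 shifts the running total one place left.
    for digit in whole:
        value = value * 16 + int(digit, 16)
    # Fractional part: the places are worth 1/16, 1/256, and so on.
    place = 1 / 16
    for digit in fraction:
        value += int(digit, 16) * place
        place /= 16
    return value


print(hex_to_decimal("4B3.3"))  # 1203.1875, the textbook's answer
print(ord("B"))                 # 66, the character code used for text
```

The first line treats B as a number grapheme; the second treats it as a letter grapheme. Same symbol, two different jobs.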

If only those lazy mathematicians had invented their own graphemes instead of just stealing from the alphabet, we wouldn't have this confusion!

Thanks for asking, and I hope there was something helpful to you in this lengthy explanation!

Professor Puzzler

Louis from Uganda asks, "What is the place value of 3 in 131 base three?"

Well, Louis, funny thing is, I can't answer that question. Because 131 is not a base three number. You see, if you are working in base n, there is no symbol for the number n. In base ten, for example, we have symbols for 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9, but there is no symbol for the number 10. We require *two* symbols to represent that number (the symbol 1 and the symbol 0).

In base three, there is no symbol for the number three. "3" as a symbol does not exist. So your number doesn't actually exist.

It would be like asking me, "What is the place value of ¥ in 75¥23 base ten?" The answer is, that's not a base ten number, because there is no symbol "¥" in the base ten system.

In base three, you have three symbols to work with. They are 0, 1, and 2. Thus, you could have the number 101, 111, or 121, but 131 doesn't exist.

So let's suppose you meant to write "What is the place value of 2 in 121 base three?" In that case, I can answer you fairly easily: that is the "threes place". Whatever digit sits in that place gets multiplied by three to find its actual value, so the 2 in 121 is worth 2 x 3 = 6.
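To make the idea concrete, here is a rough sketch that lists the place value of each digit, and complains if a symbol doesn't exist in the base. The function `place_values` is a hypothetical helper made up for this illustration, and it only handles bases two through ten.

```python
def place_values(numeral, base):
    """Return (digit, place value) pairs, leftmost digit first.

    Raises ValueError if a symbol doesn't exist in the given base.
    place_values is a hypothetical helper; it handles bases 2 to 10 only.
    """
    pairs = []
    # Walk from the rightmost digit, whose place value is base**0 = 1.
    for position, symbol in enumerate(reversed(numeral)):
        digit = int(symbol)  # int() itself rejects symbols like "¥"
        if digit >= base:
            raise ValueError(f"'{symbol}' is not a symbol in base {base}")
        pairs.append((digit, base ** position))
    return list(reversed(pairs))


print(place_values("121", 3))  # [(1, 9), (2, 3), (1, 1)]
```

Calling `place_values("131", 3)` raises a `ValueError`, which is the program's way of saying that 131 isn't a base three number.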

I could give you a more detailed answer, but we actually already have a fairly detailed unit on bases here: Number Theory Unit on Bases, so you might want to take a look at that for more information! If you're wondering why you can't use the symbol "3" in base three, take a look at this page, which elaborates a bit more: Avoiding Ambiguity in Notation.

Thanks for asking, Louis!

I was told that the number 1,000,001, no matter what base you're working in, is a composite number. Is this true?

Hmm...that's an interesting question. Let's look at some sample bases to make sure it's a reasonable statement.

If this were base ten, then 1,000,001 would be divisible by 101 (it's 101 x 9,901), so it's composite.

If this was base eleven, then the number would be equal to 1,771,562 (base ten), which is obviously composite, because it's even!

Actually, now that I think about it, any time the base is an odd number, 1,000,001 represents an even number, so it's definitely true for all odd bases. So let's focus on even numbered bases.

Let's try base eight. This number would be equal to 262,145 (base ten), which is a multiple of 5.

Base six? That's 46,657 (base ten), which is divisible by 37.

Now, I could keep trying more bases, but I just noticed something interesting. In our first example, the number is divisible by 101, which is one more than the square of 10, and in our last example, 46,657 is divisible by one more than the square of six. That's interesting. Your number may always be divisible by one more than the square of the base. Let's test that hypothesis.

In base twelve, this number would be 2,985,985, and if our hypothesis is correct, it will be divisible by 12^{2} + 1, or 145. Pull out your calculator...

It is!

So we have a reasonable extension of your conjecture: *If 1,000,001 is a number written in base n, then it is divisible by n^{2} + 1.*

Ideally, it would be nice if we could *prove* this extension, because if we could, we'd be closer to proving your conjecture.

Let's write our number 1,000,001 in terms of the base n:

1,000,001 = 1·n^{6} + 1.

And suddenly I'm remembering one of my factoring rules - the rule for a sum of cubes:

n^{6} + 1 = (n^{2})^{3} + 1^{3} = (n^{2} + 1)(n^{4} - n^{2} + 1)

Sure enough, 1,000,001 will always be divisible by n^{2} + 1!

So in order to finish proving your conjecture, we simply need to show that (n^{2} + 1) can't be equal to 1, and it can't be equal to the number itself, n^{6} + 1. These are the two circumstances which could result in n^{6} + 1 being prime (a number is composite if it has a factor pair other than one and itself).

n^{2} + 1 = 1 has only one solution: n = 0. But we don't work in base zero, so this is irrelevant.

n^{2} + 1 = n^{6} + 1 means n^{6} = n^{2}, which only happens when n is 0, 1, or -1. None of those is a base you can write 1,000,001 in (you need at least base two), so YES! The conjecture we started with is true!
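If you'd rather let a computer pull out the calculator for you, here is a quick spot check in Python. It is not a proof, just a sanity test of the factoring over bases two through one thousand (a range I picked arbitrarily); the function name `value_of_1000001` is made up for this example.

```python
def value_of_1000001(base):
    """Value of the numeral 1,000,001 when read in the given base."""
    return base ** 6 + 1


for base in range(2, 1001):
    number = value_of_1000001(base)
    factor = base ** 2 + 1
    cofactor = base ** 4 - base ** 2 + 1
    # The sum-of-cubes factoring should rebuild the number exactly...
    assert factor * cofactor == number
    # ...and the factor should land strictly between 1 and the number itself.
    assert 1 < factor < number

print(value_of_1000001(12) // (12 ** 2 + 1))  # 20593
```

If either `assert` failed for some base, the loop would stop with an error. It runs to the end, which agrees with the algebra.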

Thanks for asking - that was an interesting exercise!

Professor Puzzler

Emmanuel from Nigeria asks, "How are letters converted into binary codes?"

Well, Emmanuel, I'm guessing you came here from our Binary Coding Page. If so, you ask an excellent question, because the subject of *how* we convert letters is not really explained there.

The big question is, "How do you convert a letter to a number?" Because if you can convert the letter to a number, then you can use the information on our base conversion page to convert that number into binary. So how does the computer do the conversion from letter to number?

There is a standard listing called the "ASCII Character Set," in which every character used on a computer's keyboard is assigned a number. There are a *lot* of these characters, because there isn't just one for every key on your keyboard - there's also one for each key with the SHIFT key pressed. Strictly speaking, standard ASCII defines 128 characters, numbered from 0 to 127, but computers usually store each character in eight bits, and the "extended" versions of the chart use that extra room, for a total of 256 characters, numbered from 0 to 255.

You might wonder, "Why 256?" and the answer is, because 256 = 2^{8}, which means anything less than 256 can be written as 8 binary bits (place values). Each character needs to have the same number of binary bits - otherwise nobody would know where one character ends and the next one starts. So even though the number 15 only needs four bits to be written in binary (1111_{two}), in order to make sure all the numbers have the same length, the computer would write it as 0000 1111_{two}. We put a space between every four digits, for the same reason that we use commas in base ten - it helps us read long strings of digits more easily.
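Here is a small sketch of that padding, using Python's built-in `format` with the `"08b"` format code (binary, padded with zeros to a width of eight). The helper name `to_byte` is hypothetical, invented for this example.

```python
def to_byte(number):
    """Write a number from 0 to 255 as eight bits, grouped in fours.

    to_byte is a hypothetical helper written for this example.
    """
    if not 0 <= number <= 255:
        raise ValueError("number must fit in eight bits")
    bits = format(number, "08b")  # pad with leading zeros to eight digits
    return bits[:4] + " " + bits[4:]


print(to_byte(15))   # 0000 1111
print(to_byte(255))  # 1111 1111
```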

Okay, so with that out of the way, now we just need to know what letters are represented by what number. Here's a quick reference for you:

A = 65

B = 66

C = 67

...

X = 88

Y = 89

Z = 90

Remember I mentioned that the ASCII Codes account for the shift key? That means that lower case letters have a different number assigned than upper case letters:

a = 97

b = 98

c = 99

...

x = 120

y = 121

z = 122

Now, there are a couple things you might have wondered about, like "What comes before 65?" and "Why is there a gap between the upper and lower case numbers?"

The answer to the first question is, there are other characters in those gaps - numbers, punctuation, special control characters (like the Backspace, Enter, Delete, etc).

The reason there's a gap between the upper case and lower case alphabets is that it makes "a" 32 more than "A." That is very convenient because 32 is a power of 2 (2^{5}), so changing between upper and lower case means changing just one bit:

A = 0100 0001

a = 0110 0001
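That one-bit difference can be sketched in a few lines. This assumes plain ASCII letters only, and the names `CASE_BIT` and `swap_case` are made up for the illustration; `ord`, `chr`, and the `^` (exclusive-or) operator are standard Python.

```python
CASE_BIT = 32  # 2**5, the one bit that differs between "A" and "a"


def swap_case(letter):
    """Flip the case of a single ASCII letter by toggling one bit.

    swap_case is a hypothetical helper written for this example.
    """
    if len(letter) != 1 or not (letter.isascii() and letter.isalpha()):
        raise ValueError("expected a single ASCII letter")
    return chr(ord(letter) ^ CASE_BIT)


print(swap_case("A"))  # a
print(swap_case("a"))  # A
```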

So if you're using the computer's ASCII character set, and you wanted to convert "Hello" into binary, you would look up each letter in the ASCII chart:

H = 72 = 0100 1000_{two}

e = 101 = 0110 0101_{two}

l = 108 = 0110 1100_{two}

l = 108 = 0110 1100_{two}

o = 111 = 0110 1111_{two}

So the entire word "Hello" is:

0100 1000 0110 0101 0110 1100 0110 1100 0110 1111_{two}.
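The same lookup can be sketched in Python, where the built-in `ord` plays the role of the ASCII chart. The function name `text_to_binary` is hypothetical, and the sketch assumes every character is in the standard ASCII range (0 to 127).

```python
def text_to_binary(text):
    """Convert ASCII text to eight-bit binary, with a space every four bits.

    text_to_binary is a hypothetical helper written for this example.
    """
    groups = []
    for character in text:
        code = ord(character)  # look up the character's number
        if code > 127:
            raise ValueError(f"{character!r} is not a standard ASCII character")
        bits = format(code, "08b")  # eight binary digits, zero-padded
        groups.append(bits[:4] + " " + bits[4:])
    return " ".join(groups)


print(text_to_binary("Hello"))
# 0100 1000 0110 0101 0110 1100 0110 1100 0110 1111
```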

BUT...you don't *have* to use the ASCII conversion; you could create your own way of converting letters to numbers. Why would you want to do that? Well, you probably wouldn't, unless you wanted to conserve space, and you didn't care about anything except the basic upper case alphabet.

You see, if all you cared about was the upper case letters (and maybe a space), then you could do a conversion like this:

SPACE = 0

A = 1

B = 2

C = 3

...

X = 24

Y = 25

Z = 26

Why would you want to do that? Because now the biggest number you need to encode is 26, which is less than 2^{5}. That means that you only need five binary digits to write each number instead of eight! So your encoded message will take up ^{5}/_{8} as much space. You'll save about 38% of the space on the page.

Since you have to have space in your table for 32 characters, and you've only used up to 26, you might as well use the other ones. Maybe include some punctuation?

COMMA = 27

PERIOD = 28

QUESTION MARK = 29

DASH = 30

DOLLAR SIGN = 31
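Here is a rough sketch of that homemade five-bit chart. The position of each character in the string `ALPHABET` is its number (SPACE = 0, A = 1, ... DOLLAR SIGN = 31); the names `ALPHABET`, `encode`, and `decode` are all made up for this example.

```python
# Position in this string = the character's number in the homemade chart:
# SPACE = 0, A = 1 ... Z = 26, then comma, period, question mark, dash, dollar.
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ,.?-$"


def encode(message):
    """Write each character as five binary digits, separated by spaces."""
    # str.index raises ValueError for any character that isn't in the chart.
    return " ".join(format(ALPHABET.index(ch), "05b") for ch in message)


def decode(bits):
    """Turn space-separated five-bit groups back into characters."""
    return "".join(ALPHABET[int(group, 2)] for group in bits.split())


print(encode("HELLO"))  # 01000 00101 01100 01100 01111
print(decode("01000 00101 01100 01100 01111"))  # HELLO
```

That's 25 binary digits for "HELLO" instead of the 40 that the eight-bit version needs.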

Or you could create a table with 64 characters, which would either let you put in a lot more punctuation, or the numbers, or the lower case alphabet. But now you're using six binary digits per character, so you're not saving as much space.

Or you could completely jumble your character chart, which makes it harder for other people to decode:

A = 17

B = 3

C = 25

etc...

But if you really want to make a coded message, there are much better ways to do it, so everyone just sticks with the standard ASCII codes in order to keep things simple.

A couple more things:

- The ASCII chart is available online; just go to Google and search for "ASCII codes" and you'll get the entire list!
- ASCII stands for "American Standard Code for Information Interchange"

Thanks for asking, Emmanuel. I probably gave you much more information than you were expecting, but I hope you found it both interesting and helpful!

Professor Puzzler

PS - you can find more information about encoding here: Colors, Numbers, and Graphemes.