Primitive Data Types

Generally speaking, a data type is either primitive or composite. A primitive data type can either be a basic type that provides the basic building blocks for a language or a built-in type that the language treats and supports as a basic type. Composite types are combinations of primitive types or other composite types, and they may or may not be built into the language you are using. The difference is that a primitive type, like a prime number, cannot be decomposed into simpler types.

“In some places primitive data types mean any that are built in data types.”

Most of your primitive types will be “value types”, meaning their value is a datum or specific set of bits. A value type is therefore the relationship between a set of data and a set of entities sharing the same attributes. Value types do not have a constraint on how the values are stored, so primitive types may not all have a direct relationship to objects in memory.

In some languages (C#, Swift, etc.) the term value type refers to the way the value is assigned. When a value type is created, the full amount of memory used by that type is allocated. So if it is an int32, then 32 bits will be allocated for that variable. Value types are typically stored in “stack” memory. If the value of a variable changes, new memory is not allocated because the size is fixed. However, copying the value to a new variable creates a new copy of that value.

“There is a whole rabbit hole to go down concerning how data is stored, copied, and manipulated in memory.”

These are opposed to reference types, which hold a reference to the value. These types are not fixed in size, so if the value of a reference type changes, a new area of memory may have to be allocated. However, if it is copied, only the reference needs to be copied to the new variable. The values are stored on the “heap”, with references to them stored on the stack.
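
Python does not have C#-style value types, so the following is only a rough sketch of the copy behavior described above: copying a reference makes two names share one object, while an explicit copy produces an independent value. The variable names are made up for illustration.

```python
# Copying a reference: both names point at the same list object.
original = [1, 2, 3]
alias = original
alias.append(4)

# Copying the value: the new list is independent of the original.
independent = list(original)
independent.append(5)

print(original)     # the change made through alias is visible here
print(independent)  # the extra element exists only in the copy
```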

Episode Breakdown

Integers

15:09 Mathematical Integers

An integer holds a mathematical integer up to a certain size. The range of numbers held in an integer will depend on the byte/bit size of the type. This range scales exponentially as each bit added doubles the size of the range.

“Bear in mind you get a block of 8 bits.”

The maximum value will be 2^(number of bits) – 1. Each bit has two states (on or off), so the number of possible values is 2 raised to the total number of bits. One is subtracted from that total because counting starts at 0. The maximum 8 bit integer, 2^8 – 1, is 255, whereas the maximum 16 bit integer, 2^16 – 1, is 65,535.
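
As a quick check of that arithmetic, here is a minimal sketch. The helper name `max_unsigned` is hypothetical and not from any library.

```python
def max_unsigned(bits):
    """Largest value an unsigned integer of the given bit width can hold."""
    return 2 ** bits - 1

for bits in (8, 16, 32):
    print(bits, max_unsigned(bits))
```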

16:25 Signage

“Which is why sometimes you’ll get weird overflow errors…like old Nintendo games.”

These may be signed or unsigned. If they are signed, then the first bit in the sequence will be associated with the sign of the integer. It will be either positive or negative, and the range will run from the lowest negative to the highest positive number, with 0 near the middle.

Instead of 0 to 2^(number of bits) – 1, the range will be -2^(number of bits – 1) to 2^(number of bits – 1) – 1.
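
The signed range can be sketched the same way. `signed_range` is a made-up helper name used only for this illustration.

```python
def signed_range(bits):
    """Return (lowest, highest) for a signed integer of the given bit width."""
    low = -(2 ** (bits - 1))
    high = 2 ** (bits - 1) - 1
    return low, high

print(signed_range(8))
print(signed_range(16))
```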

“Don’t worry, it gets worse.”

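As you see, one bit is taken to designate the sign. Unsigned integers are treated as always positive, starting at 0. On most systems negative signed integers are stored in two's complement, so the bit patterns with the first bit set count up from the lowest negative number to -1. For an 8 bit signed integer: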

00000000 = 0

10000000 = -128

01111111 = 127

11111111 = -1
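
The four bit patterns above match two's complement encoding. Assuming that encoding (the notes do not name it), a small sketch can decode them. The function name is hypothetical.

```python
def from_twos_complement(pattern):
    """Interpret a string of 0s and 1s as a signed two's complement integer."""
    bits = len(pattern)
    value = int(pattern, 2)      # read the pattern as unsigned first
    if pattern[0] == "1":        # sign bit set, so subtract 2^bits
        value -= 2 ** bits
    return value

for pattern in ("00000000", "10000000", "01111111", "11111111"):
    print(pattern, "=", from_twos_complement(pattern))
```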

21:08 Storage

Typically integer literals are written as regular numerals with a sequence of digits and if negative a minus sign in front. Most languages don’t allow commas for digit grouping. Integers can be written as hexadecimal (base 16) values.
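
In Python, for example, the literal forms look roughly like this. Note that Python accepts underscores for digit grouping, which is a Python-specific detail and not something every language offers.

```python
decimal_literal = 255
negative_literal = -42
hex_literal = 0xFF             # hexadecimal (base 16)
grouped_literal = 1_000_000    # underscores are allowed, commas are not

print(decimal_literal == hex_literal)
print(int("ff", 16))           # parsing hexadecimal text at runtime
print(hex(255))
```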

Floating Point Numbers

22:50 Not Real Numbers

“If you think you’ve got it, just wait for the next round.”

Floating point numbers are rational numbers that may have a decimal or fractional part. They are not the real numbers themselves but representations stored as the parts of a formula. The term floating-point refers to the fact that the decimal point (radix point or binary point) can be placed anywhere relative to the significant digits.

23:34 Storage

These are usually stored in a variation of scientific notation. The literal for a floating point typically has a decimal and e or E to show scientific notation e.g. 6.022e23 represents 6.022 x 10^23.

“The number that always makes me crave guacamole”

The first number (before the E) is called the significand, and it determines the significance or precision of the number being represented. The E stands in for the base. In a written literal that base is 10; in storage it is typically base 2 (binary) or base 10 (decimal), but in principle it can be any number. Last comes the exponent, which is the power the base is raised to.
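
To see the pieces, Python's standard `math.frexp` splits a float into a base 2 significand and exponent. This is only a sketch: `frexp` normalizes the significand to between 0.5 and 1, which is a different convention from the one used inside the stored bit pattern.

```python
import math

avogadro = 6.022e23    # decimal scientific notation in the literal

# Decompose into significand * 2**exponent (base 2).
significand, exponent = math.frexp(avogadro)
print(significand, exponent)

# Recombining the two parts gives back the original value.
print(math.ldexp(significand, exponent) == avogadro)
```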

Some bases can’t be used to represent certain numbers exactly. There is no base that can represent every number in a finite number of digits, e.g. 1/5 cannot be represented exactly in binary, but it can in decimal (base 10), where it is 0.2.
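
A short sketch of that limitation using only Python's standard library: binary floats cannot hold 1/5 or 1/10 exactly, while the `decimal` module (base 10) can.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floating point: small errors show up in the comparison.
print(0.1 + 0.2 == 0.3)                    # False

# Fraction(0.2) exposes the exact value the float really stores.
print(Fraction(0.2) == Fraction(1, 5))     # False

# Base 10 arithmetic represents these values exactly.
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True
```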

Irrational numbers get even more interesting, as you need to find a rational approximation. Floating point numbers are typically formatted using IEEE 754 encoding, but that may differ based on the system you are using.

27:19 Precision

“That’s a business definition.”

They have limited precision so not all real/rational numbers are exactly represented. Some numbers have to be approximated with a degree of accuracy.

“I’ve played Kerbal, it’s a good game.”

Single precision floating-point numbers are 32 bit numbers. The first bit is reserved for the sign of the number (positive or negative). The next 8 bits are the exponent or the multiplier. The final 23 bits are called the fraction, where the significand is stored.

Double precision floating-point numbers are 64 bit numbers. Like single precision the first bit is reserved for the sign. The next 11 bits are the exponent and the final 52 bits the fraction.
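
The single precision layout can be pulled apart with Python's `struct` module. This is a minimal sketch that assumes IEEE 754 encoding; `float32_fields` is a hypothetical helper name.

```python
import struct

def float32_fields(value):
    """Return the (sign, exponent, fraction) bit fields of a 32 bit float."""
    # Pack as a big-endian 32 bit float, then reread the bytes as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    sign = raw >> 31                 # 1 bit
    exponent = (raw >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = raw & 0x7FFFFF        # 23 bits
    return sign, exponent, fraction

print(float32_fields(1.0))
print(float32_fields(-2.5))
```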

30:00 Complex Numbers

To get really crazy, some languages like Python also have complex numbers, which are made up of two floating point numbers. One is the real part and the other the imaginary part. If you don’t recall from high school math, an imaginary number is a real number multiplied by i, where i = √-1.
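
In Python this looks roughly like the following. Python writes the imaginary unit as `j` and not `i`.

```python
z = complex(3.0, 4.0)    # the same value as the literal 3 + 4j

print(z.real, z.imag)    # the two floating point parts
print(abs(z))            # magnitude, 5.0
print(1j * 1j)           # i squared is -1
```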

Fixed Point Numbers

31:56 Fixed Digits

“This is the trick that we used to use instead of fighting with floating point.”

These are also rational numbers that may have a fractional part. They have a fixed number of digits after the decimal. Some may have a fixed number before as well. They can improve performance and accuracy over floating points. They are most useful for representing base 2 and base 10 fractions.

33:00 Notation

The notation for representing the size of a fixed-point number can be confusing as there are several ways to represent it depending on what is being controlled.

In the Q (number format) notation that follows, f is the number of bits in the fractional part, m the number of integer or magnitude bits, and s the sign bit.

The “Q” prefix such as Qf (Q15) shows that there are f fractional bits. This doesn’t denote word length. That is assumed to be either 16 or 32.

Less ambiguous is the Qm.f (Q1.30) format, where m is the number of magnitude or integer bits. Based on the number of bits in the m.f, the fact that there is a sign bit can be inferred. Q1.30 denotes a fixed point number with 1 integer bit, 30 fractional bits, and 1 sign bit. The s:m:f format gives the number of bits for each of sign, magnitude, and fraction.
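
As an illustration of the Q15 case, a fixed point value is just an integer scaled by 2^15. The helper names below are made up, a 16 bit word is assumed, and the sketch does no range checking or saturation.

```python
FRACTION_BITS = 15    # Q15: 15 fractional bits (assumed 16 bit word)

def to_q15(value):
    """Convert a float to its Q15 integer representation (rounded)."""
    return round(value * (1 << FRACTION_BITS))

def from_q15(raw):
    """Convert a Q15 integer back to a float."""
    return raw / (1 << FRACTION_BITS)

raw = to_q15(0.5)
print(raw, from_q15(raw))
```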

36:00 Precision

Just like floating point numbers, fixed point numbers have limited precision. Only a subset of real/rational numbers is represented exactly; the others are approximations.

Fixed point arithmetic can produce products with more bits than the operands, so answers must be rounded or truncated, which loses information. This causes problems in choosing which bits to keep and which to drop or round. Fixed point numbers also have a more limited range of values than floating point.
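
A sketch of that effect with Q15 values: the raw product carries 30 fractional bits, so it has to be shifted back down, and the low bits are simply dropped. The function name is hypothetical and truncation is just one possible choice.

```python
FRACTION_BITS = 15    # Q15 operands

def q15_multiply(a, b):
    """Multiply two Q15 integers and truncate the result back to Q15."""
    product = a * b                      # 30 fractional bits at this point
    return product >> FRACTION_BITS      # drop the low 15 bits

half = 1 << 14                           # 0.5 in Q15
print(q15_multiply(half, half))          # 0.25 in Q15

smallest = 1                             # smallest positive Q15 value
print(q15_multiply(smallest, smallest))  # truncated all the way to 0
```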

Boolean

36:20 Logic Types

“Different languages treat this differently.”

These are logic types that can be either true or false (or null if the type is nullable). Though only one bit is required for the true/false dichotomy, many languages allocate a whole byte to a boolean. Some languages treat them as their own type, while others convert them to a number type.
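
Python sits in between those two camps: `bool` is its own type, but it is also a subclass of `int`, so booleans behave like 1 and 0 in arithmetic.

```python
print(isinstance(True, int))    # True
print(True + True)              # 2
print(int(False))               # 0
```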

40:31 Boolean Algebra

You have an output of either true (1) or false (0) with a single input or multiple inputs of either true or false. These inputs go through logic gates to create the output. AND gates output true if both inputs are true. OR gates output true if either input is true. NOT gates output the opposite of the input. Various combinations of these create logic circuits.
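
The three gates can be sketched as small functions. The XOR gate below is an extra example, not from the episode, added only to show one gate built by combining the others.

```python
def and_gate(a, b):
    return a and b

def or_gate(a, b):
    return a or b

def not_gate(a):
    return not a

def xor_gate(a, b):
    # True when exactly one input is true: (a OR b) AND NOT (a AND b).
    return and_gate(or_gate(a, b), not_gate(and_gate(a, b)))

# Print the full truth table for two inputs.
for a in (False, True):
    for b in (False, True):
        print(a, b, and_gate(a, b), or_gate(a, b), xor_gate(a, b))
```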

Characters

42:40 Single Specialized Unit

A character (char) contains a single specialized unit (letter, digit, punctuation, etc.). In languages like C it is also the smallest addressable unit of memory. Usually this is 8 bits. Many standards require that the minimum unit of memory be at least 8 bits.

“Oh wait, we have international markets.”

The values of a character type precisely represent a code unit. Control characters are those that do not correspond to a specific symbol in natural language. These can be carriage returns or tabs. They are also used for device instruction such as printers or displays.
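
Python has no separate char type, but `ord` and `chr` give a rough view of the numeric code behind a character, including control characters such as tab and carriage return.

```python
print(ord("A"), chr(65))       # a letter and its numeric code

# Control characters have codes but no printable symbol.
print(ord("\t"), ord("\r"))    # tab and carriage return

print(repr(chr(10)))           # code 10 is the newline control character
```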

45:10 Encoding Standards

There are different standards within the industry for encoding characters.

ASCII – American Standard Code for Information Interchange is a character encoding standard. It was developed from telegraph code and was the first commercially used encoding. It encodes 128 characters into a 7 bit code.

Unicode – An industry standard for electronic encoding of text in most writing systems.

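UTF – Unicode Transformation Format is a family of Unicode encodings built on 8, 16, or 32 bit code units (UTF-8, UTF-16, and UTF-32). UTF-32 uses a fixed width per character, while UTF-8 and UTF-16 are variable length.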

47:28 Sizing

The size of a char or character varies depending on your encoding and the language you are using. In C, a char is a data type that is the exact size of a byte. In other languages, since a Unicode code point needs up to 21 bits, a variable length encoding such as UTF-8 is used.
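
To make the size differences concrete, this sketch encodes four sample characters (chosen only for illustration) and counts the bytes each encoding needs.

```python
# A, e with acute accent, euro sign, and an emoji, written as escapes.
samples = ("A", "\u00e9", "\u20ac", "\U0001F600")

for text in samples:
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")
    utf32 = text.encode("utf-32-le")
    # Code point, then the number of bytes in each encoding.
    print(f"U+{ord(text):04X}", len(utf8), len(utf16), len(utf32))
```

UTF-8 grows from one to four bytes across these samples, UTF-16 uses two or four, and UTF-32 always uses four.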

47:55 What about Strings?

Strings would classify as composite data types using the definition we chose for this episode. Strings can be implemented in various ways depending on your language. The simplest is an array of characters followed by a delimiting character to signify the end of the string (null terminated).
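
Python strings are not null terminated, but the idea can be imitated with a byte buffer. This is only a sketch: `read_c_string` is a hypothetical helper and ASCII content is assumed.

```python
def read_c_string(buffer):
    """Read bytes up to the first null terminator and decode them."""
    end = buffer.index(b"\x00")          # position of the terminator
    return buffer[:end].decode("ascii")

raw = b"hello\x00leftover bytes"
print(read_c_string(raw))                # everything after the null is ignored
```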

Some higher level languages do not differentiate between strings and characters. A string with a length of one is used in place of a character. Some contain both character arrays and a string type.
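
Python is one example of the first approach: indexing into a string gives back another string of length one.

```python
word = "data"
first = word[0]

print(type(first).__name__, len(first))   # str 1
print(first[0] == first)                  # indexing it again changes nothing
```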

DevSpace: North Alabama's Premier Polyglot Technology Conference

IoTease: Project

EjectABed

Not long ago we talked about the Netduino coming back on the scene, and I was looking at projects to try with one when I found this great idea for getting your kids out of bed when even an alarm clock won’t work. It uses an old hospital bed, a servo, a Raspberry Pi 2B, and of course a Netduino to “gently” eject your kids from bed and get them up and going in the mornings.

Hardware

  • Used Hospital Bed
  • Regular Bed
  • Raspberry Pi 2 Model B
  • Servo
  • Portable Wireless Router
  • Netduino3

Tricks of the Trade

When you see someone else being successful or helping others being successful, it’s very strange how people will try to tear them down. It’s a sign you are in a bad place and that you don’t want to confront that you aren’t doing the work needed to succeed. It’s a sign of a lack of maturity, jealousy, and frankly is simply a viewing of someone else’s highlight reel, rather than the reality of how they got there.

One comment on “Primitive Data Types”
  1. Curtis Carter says:

    https://en.wikipedia.org/wiki/Klingon_alphabets talks about Klingon being rejected in 2001, but per http://www.klingonwiki.net/En/Unicode a new request was submitted in 2016 stating that tolkien alphabets are on the Unicode roadmap and that klingon is in active use as there are several fonts that use it (http://www.evertype.com/standards/csur/klingon.html) and Paramount uses it for Star Trek Movies.