Composite Data Types
Data types are used in programming to categorize data. Be they in a strongly typed language like C, C++, Java, in a database using SQL or NoSQL, or in a loosely typed language like Ruby or JavaScript (in which there are underlying datatypes) they differ according to the language or database being used. They are primarily used for system optimization and error prevention.
Generally speaking a data type is either primitive or composite. A primitive data type can either be a basic type that provides the basic building blocks for a language or a built-in type that the language treats and supports as a basic type. If you haven’t done so go back and listen to our episode on Primitive Data Types. Composite types are ones that are a combination of primitive types or other composite types. These may or may not be built into the language you are using.
There are a lot of composite data types. Do a Google search and you’ll find too many to even briefly discuss in one episode. We selected a few of the most common composite data types. They may be called different things depending on your language, so we tried to point that out where possible (we don’t know every language out there). There is a lot to learn when it comes to how information is taken in, passed around, and used in every language. This has been an overview of some of the ways to do that and we plan to go into further detail on each of these types.
Episode Breakdown
Arrays
An array is a collection of values with an index or key for each. Data is stored so that the value can be found quickly from the index.
09:45 Indexes
Indexes are generally positive integers. Most of the time the index starts at 0 and goes to one less than the total number of values in the array (n – 1). This is called zero-based indexing. One-based indexing starts with the first value indexed as 1.
“They’re normalizing the zero, but it’s basically syntactic sugar under the hood.”
N-based indexing allows the base, or first index/key, to be chosen. Some languages even allow negative indexes. The foundation address is the memory address of the item at index 0, or the first item.
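As a rough illustration of the indexing schemes above, here is a small sketch in Python, where a list stands in for an array. The variable names and the one_based helper are made up for this example.

```python
# A Python list stands in for an array; all names here are illustrative only.
values = ["a", "b", "c", "d"]

first = values[0]                # zero-based: the first item lives at index 0
last = values[len(values) - 1]   # the last item lives at index n - 1
also_last = values[-1]           # Python accepts negative indexes, counted from the end


def one_based(array, position):
    """Emulate one-based indexing by shifting the position down by one."""
    return array[position - 1]


first_again = one_based(values, 1)
```

The helper hints at why one-based indexing can be thought of as a thin layer over zero-based storage: the offset is simply adjusted before the lookup.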
11:10 Dimensions
A one-dimensional array, also known as a linear array, has values that can be found by a single subscript (array[0]). Multi-dimensional arrays are similar to matrices in math. The simplest is a two-dimensional array where the first subscript is the row and the second is the column. They can have as many dimensions as your language will allow.
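A two-dimensional array can be sketched in Python as a list of rows. This is only one possible representation, and the grid contents are invented for the example.

```python
# A two-dimensional array sketched as nested lists: one inner list per row.
grid = [
    [1, 2, 3],   # row 0
    [4, 5, 6],   # row 1
]

row, column = 1, 2
cell = grid[row][column]   # first subscript picks the row, second picks the column

row_count = len(grid)
column_count = len(grid[0])
```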
11:49 Limitations
Arrays have a few limitations. The size of the array is fixed (in most languages). Inserting a new element is expensive because existing elements must be shifted to make room for it.
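To make the cost of insertion concrete, here is a minimal sketch that treats a Python list as a fixed-capacity array. The insert_at function and its parameters are hypothetical, written only to show the shifting.

```python
def insert_at(array, count, index, value):
    """Insert value at index in a fixed-capacity array that holds count items.

    Every item from index onward is shifted one slot to the right, so the
    work grows with the number of items that have to move.
    """
    if count >= len(array):
        raise OverflowError("array is full; a fixed-size array cannot grow")
    if not 0 <= index <= count:
        raise IndexError("index out of range")
    for i in range(count, index, -1):   # walk backward so nothing is overwritten
        array[i] = array[i - 1]
    array[index] = value
    return count + 1                    # the new number of slots in use


slots = [10, 20, 30, None, None]        # capacity of 5, with 3 slots in use
used = insert_at(slots, 3, 1, 15)
```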
Tuples
In math, tuples are finite ordered lists (like arrays) of elements. An n-tuple is a sequence of n elements. Mathematicians usually write them inside parentheses separated by commas.
13:15 Size
The term tuple comes from the use of terms like single, double, triple, etc. A tuple can be any size, so the term represents any of them. A 0-tuple () is a null tuple (it has nothing in it), a 1-tuple (a) is a singleton, a 2-tuple (a,b) is an ordered pair, and a 3-tuple (a,b,c) is a triplet.
14:00 Data Representation
They can represent data in a record. Components can represent individual fields of that record. They are used to provide easy access and manipulation of data. They can also be used to provide output of a method or easy input without multiple parameters.
“The example I have is Tuple TheDude = (“Jeffrey Lebowski”, 48, 212.5)”
As opposed to arrays, a tuple can hold data of different types. A single tuple can contain a string, an int, a decimal, etc.
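Here is a short sketch of a tuple used as the output of a method, using the values from the quote above. The function name is hypothetical, and Python's float stands in for a decimal type.

```python
def describe_the_dude():
    """Return several values of different types at once as a single tuple."""
    return ("Jeffrey Lebowski", 48, 212.5)   # a string, an int, and a float


the_dude = describe_the_dude()
name = the_dude[0]                            # components are read by position
kinds = tuple(type(item).__name__ for item in the_dude)
```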
15:50 Languages
“Try looking up tuples and not get all Python results”
They are used heavily in lower level languages and in data science languages like Python. Tuples are immutable, so once one is created in memory it cannot change. While they cannot be changed, they can be made to reference other tuples. In Python you can unpack a tuple into variables by placing the variables on the left side of the assignment.
(birth_date, death_date, married, children, first_name, last_name) = person
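For that line to run, person has to be a tuple with six components in the same order. A minimal sketch follows; the record is entirely made up, and the failed assignment is included only to show immutability.

```python
from datetime import date

# Made-up record; the field order matches the unpacking line.
person = (date(1900, 1, 1), date(1980, 1, 1), True, 2, "Jane", "Doe")

(birth_date, death_date, married, children, first_name, last_name) = person

try:
    person[4] = "Janet"      # tuples are immutable, so item assignment fails
    mutated = True
except TypeError:
    mutated = False
```

Unpacking requires the same number of names on the left as there are components in the tuple; otherwise Python raises a ValueError.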
Linked Lists
Collections of values that are not stored in a contiguous location. Each record is called an ‘element’ or a ‘node’.
17:30 Elements
Elements in a linked list are connected (linked) using pointers. Each element contains the data for that element and pointer information to connect it to the list. Since the memory is not contiguous they can grow or shrink as needed.
19:05 Types of Linked Lists
A few different types of linked lists exist. Singly linked lists have elements with a data field and a next field. The data field holds the information for that element in the list. The next field has the pointer to the next element in the list.
“The only time you’ll ever see a singly linked list is in homework.”
A doubly linked list contains the same as a singly linked list as well as a pointer to the previous node. Multiply linked lists have two or more fields that link the node to other nodes in the list. Each link connects the same set of data differently. In circularly linked lists the final node instead of having a null pointer points back to the first node in the list.
Some linked lists use sentinel nodes. These are empty elements used as first nodes to start the list. This ensures that even in an empty list there will be a first and last node.
22:52 Insertion
There are three ways to insert into a linked list. First is by replacing the head or first element of the list. Next you can insert after a given node. You must first have the new element point to the next element in the list. Then you must change the pointer on the previous element to point to the new element. Doing this in the other order loses the link to the rest of the list unless it was saved first. Finally you can add to the end of the list.
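The three kinds of insertion can be sketched for a singly linked list as follows. This is a minimal illustration rather than a production implementation; the Node class and the function names are assumptions made for the example.

```python
class Node:
    """One element of a singly linked list: a data field and a next field."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


def insert_at_head(head, data):
    """The new node points at the old head and becomes the new head."""
    return Node(data, head)


def insert_after(node, data):
    """The new node is built pointing at node's successor, then node points at it."""
    node.next = Node(data, node.next)


def append(head, data):
    """Walk to the last node and attach the new node there."""
    if head is None:
        return Node(data)
    current = head
    while current.next is not None:
        current = current.next
    current.next = Node(data)
    return head


def to_list(head):
    """Collect the data fields in order, for inspection."""
    items = []
    while head is not None:
        items.append(head.data)
        head = head.next
    return items


head = insert_at_head(None, "b")   # b
head = insert_at_head(head, "a")   # a -> b
insert_after(head, "a2")           # a -> a2 -> b
head = append(head, "c")           # a -> a2 -> b -> c
```

Appending walks the whole list here; keeping a separate reference to the last node is a common way to avoid that walk.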
25:00 Deletion
Deleting a node from the list is a bit more complicated. First you have to find the element previous to the one to be deleted. Then you have to change its pointer to point to the element after the one being deleted. Finally you can free up the memory of the element you are deleting.
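Those deletion steps might look like this for a singly linked list. The Node class and the delete function are hypothetical names for this sketch, and since Python is garbage collected, "freeing" the node just means dropping the references to it.

```python
class Node:
    """One element of a singly linked list."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


def delete(head, target):
    """Remove the first node holding target and return the (possibly new) head."""
    if head is None:
        return None
    if head.data == target:              # deleting the head: there is no previous node
        return head.next
    previous = head
    while previous.next is not None and previous.next.data != target:
        previous = previous.next         # step 1: find the node before the target
    if previous.next is not None:
        doomed = previous.next
        previous.next = doomed.next      # step 2: bypass the node being deleted
        doomed.next = None               # step 3: drop its link so it can be reclaimed
    return head


head = Node(1, Node(2, Node(3)))
head = delete(head, 2)

remaining = []
node = head
while node is not None:
    remaining.append(node.data)
    node = node.next
```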
27:15 Advantages and Disadvantages
They have some advantages over arrays. They are not fixed in size and can grow or shrink as needed. Because they are not contiguous it is easy to add or remove elements. Dynamic structures such as queues and stacks can be implemented using linked lists.
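As one example of a dynamic structure built on a linked list, here is a minimal stack sketch. The class and method names are assumptions for illustration; pushing and popping only touch the head, so neither needs to shift or copy other elements.

```python
class Node:
    """One element of a singly linked list."""

    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


class LinkedStack:
    """A last-in, first-out stack that grows and shrinks one node at a time."""

    def __init__(self):
        self.head = None
        self.size = 0

    def push(self, data):
        self.head = Node(data, self.head)   # the new node becomes the head
        self.size += 1

    def pop(self):
        if self.head is None:
            raise IndexError("pop from empty stack")
        node = self.head
        self.head = node.next               # the next node becomes the head
        self.size -= 1
        return node.data


stack = LinkedStack()
for item in ("first", "second", "third"):
    stack.push(item)
top = stack.pop()
```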
“You can run a battery down quick if you aren’t efficient.”
However they also have some disadvantages. Elements have to be accessed sequentially, which slows down searching. They require extra space in memory for the pointer to the next element. Non-contiguous storage of nodes requires more time to access them. Reversing or going backward in a list can be difficult with singly linked lists.
Strings
“In computer science a string is any finite sequence of characters” ~www.linfo.org/string.html
31:00 Sequences of Characters
Length is the most important characteristic of a string. It tells how many characters are in the string. Empty strings have a length of 0.
“Took me about an hour or so to figure out I’m an idiot.”
Substrings are contiguous sequences of characters within strings. Each character is represented by a number (ASCII/Unicode). In the string “Hello World” the character H is represented by 72 in decimal (base 10) and 48 in hexadecimal (base 16). That number is held in the computer as a byte (8 bit unit) of binary (1’s and 0’s).
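The numbers above can be checked with a few lines of Python. This is just a quick sketch using built-in functions; the variable names are illustrative.

```python
text = "Hello World"

length = len(text)              # how many characters are in the string
substring = text[0:5]           # a contiguous run of characters within the string

code = ord(text[0])             # the number that represents the character "H"
as_hex = format(code, "x")      # the same number written in hexadecimal
as_bits = format(code, "08b")   # the same number written as eight binary digits
```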
Many times strings are treated like arrays of characters with an index for each character and an assigned size for the string. Null terminated strings are stored with a null character as the final character representing the end of the string. Byte terminated strings use a special character such as $ (assembler) or : (CDC systems). Bit terminated strings use a “word mark” bit to represent the end of the string. The IBM 1401 used a word mark as its string delimiter. Some languages store the string length as a prefix of the string.
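Two of those layouts, null termination and a length prefix, can be sketched with Python bytes. The function names are hypothetical, and the one-byte length prefix is an assumption that limits this example to 255 characters.

```python
def to_null_terminated(text):
    """Store the characters followed by a null byte marking the end."""
    return text.encode("ascii") + b"\x00"


def from_null_terminated(raw):
    """Read characters up to, but not including, the first null byte."""
    end = raw.index(b"\x00")
    return raw[:end].decode("ascii")


def to_length_prefixed(text):
    """Store the length in one leading byte, followed by the characters."""
    data = text.encode("ascii")
    if len(data) > 255:
        raise ValueError("a one-byte prefix cannot describe more than 255 characters")
    return bytes([len(data)]) + data


def from_length_prefixed(raw):
    """Read the length from the first byte, then take that many characters."""
    length = raw[0]
    return raw[1:1 + length].decode("ascii")


terminated = to_null_terminated("Hi")
prefixed = to_length_prefixed("Hi")
```

Reading a null terminated string means scanning for the terminator, while a length prefix gives the size up front at the cost of capping the maximum length.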
35:00 Manipulation
Concatenation is the combination of two or more strings into one string. The opposite of concatenation is splitting, which breaks a string down into multiple smaller component strings. This can be useful when you need to get specific information. Prefixes and suffixes are parts of strings at the beginning or end respectively.
“String xyz is ‘ManBearPig’”
Strings may also be rotated or reversed. Searching strings typically involves a needle in the haystack approach: looking for a substring within a larger string. An example is searching for the word “to” in “To be or not to be that is the question…”.
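Most of these manipulations map onto built-in string operations in Python. The sketch below reuses the examples from the episode; the variable names are illustrative.

```python
joined = "Man" + "Bear" + "Pig"           # concatenation
words = "To be or not to be".split(" ")   # splitting into component strings

has_prefix = joined.startswith("Man")     # prefix check
has_suffix = joined.endswith("Pig")       # suffix check

reversed_text = joined[::-1]              # reversal via a slice with step -1
rotated = joined[3:] + joined[:3]         # rotate left by three characters

haystack = "To be or not to be that is the question"
position = haystack.find("to")                     # case-sensitive; -1 if not found
position_any_case = haystack.lower().find("to")    # ignore case by lowering first
```

Note that the case-sensitive search skips the leading “To” and finds the later lowercase “to”.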
42:10 Various Languages
Strings are treated differently in various programming languages. In some programming languages (C#, Java, Python) strings are immutable at a low level. Unlike numbers, which take a set amount of memory no matter the actual value, strings only take what they need because they could be rather large. Therefore any manipulation of a string creates a new string, because there may or may not be contiguous memory available to grow the original in place. In other languages (C++, Ruby) strings are mutable. In C++, std::string is an instantiation of the std::basic_string class template.
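Python is one of the languages with immutable strings, which a short sketch can demonstrate. The bytearray at the end is included only as a contrast, since it is a mutable sequence of bytes rather than a string.

```python
original = "ManBearPig"

try:
    original[0] = "P"            # strings do not support item assignment
    changed_in_place = True
except TypeError:
    changed_in_place = False

shouted = original.upper()       # manipulation returns a new string instead

buffer = bytearray(b"ManBearPig")   # a mutable byte sequence, for contrast
buffer[0] = ord("P")                # this one can be changed in place
```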
Dates and Times
These can be Dates, DateTimes, TimeStamps, etc. based on what language or system you are using. Date is used when the exact time is not needed. DateTime and TimeStamp are used when both the date and time are needed.
44:45 TimeSpan
“That cookie is going to get its time from whatever browser it’s in.”
Timespans are measurements of time between two points. This is what you get when you subtract one DateTime from another. They are good for measuring expiration of things like tokens or cookies.
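In Python the timespan type is called timedelta. Here is a minimal sketch of an expiration check; the dates and the 30 minute lifetime are made-up values for illustration.

```python
from datetime import datetime, timedelta

issued_at = datetime(2024, 1, 1, 12, 0, 0)     # made-up time a token was issued
checked_at = datetime(2024, 1, 1, 12, 45, 0)   # made-up time it is being checked
lifetime = timedelta(minutes=30)               # hypothetical token lifetime

age = checked_at - issued_at    # subtracting two datetimes yields a timedelta
expired = age > lifetime        # timespans can be compared directly
```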
46:03 Representation
They are typically represented as the number of seconds or milliseconds from an epoch. The most widely used epoch, the Unix epoch, is midnight UTC on January 1, 1970. Some languages can only go back as far as that date. Other languages store the epoch information and can go back even further.
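The epoch representation can be sketched by counting milliseconds from January 1, 1970 by hand. This assumes UTC throughout, and the chosen dates are arbitrary examples.

```python
from datetime import datetime, timedelta, timezone

epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
one_millisecond = timedelta(milliseconds=1)

moment = datetime(2000, 1, 1, tzinfo=timezone.utc)
millis = (moment - epoch) // one_millisecond         # whole milliseconds since the epoch

round_trip = epoch + timedelta(milliseconds=millis)  # convert the number back to a date

# Dates before the epoch simply come out as negative numbers.
day_before = datetime(1969, 12, 31, tzinfo=timezone.utc)
negative_millis = (day_before - epoch) // one_millisecond
```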
46:30 Functionality
In most frameworks and languages you can get a lot of information from the DateTime. Obvious is the Date (year, month, day) and the time. Less obvious is the day of the week or year. You can also get the milliseconds, hour, minute, and second individually.
There are several basic functions you can do with dates and times. You can add or subtract timespans (months, hours, days). You can compare multiple dates and times to determine in what order they occur. You can also format a date or time as a string.
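A brief sketch of these operations using Python's datetime module follows. The date is an arbitrary example, and weekday() counts from Monday as 0.

```python
from datetime import datetime, timedelta

moment = datetime(2024, 2, 28, 9, 30, 15)

# Individual pieces of the date and time.
pieces = (moment.year, moment.month, moment.day,
          moment.hour, moment.minute, moment.second)
day_of_week = moment.weekday()             # Monday is 0, Sunday is 6
day_of_year = moment.timetuple().tm_yday   # 1 for January 1

later = moment + timedelta(days=2)         # add a timespan
earlier = moment - timedelta(hours=10)     # subtract a timespan
in_order = earlier < moment < later        # compare to determine ordering

label = later.strftime("%Y-%m-%d %H:%M")   # format as a string
```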
Records: Structures and Classes
These are collections of fields. They can contain different data types. Usually they have a fixed number and sequence of fields. Fields within a record may be called members, elements, etc.
48:42 Collections of Data Types
They are a collection of data types with identifiers. Typically they have a value/variable relationship or even a key/value relationship, where each field specifies its data type and its key or identifier. Some fields may contain functions or procedures that return data of that type.
49:37 Structures
“It’s basically just here is a sequential chunk of state.”
A structure or struct is a type in several languages that holds a record. In Object Oriented Programming (OOP) it is called a PODS (Plain Old Data Structure). It contains a collection of other data types within it.
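Python has no struct keyword, but a dataclass with only fields is a reasonable stand-in for a plain record. The Point type below is a hypothetical example, not something from the episode.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """A plain record: named fields with declared types and no behavior of its own."""
    x: float
    y: float
    label: str


corner = Point(1.0, 2.0, "corner")
same_corner = Point(1.0, 2.0, "corner")
records_match = corner == same_corner   # dataclasses compare field by field
```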
50:48 Classes
In OOP a class is a template for creating an object. It not only contains data but also methods of using that data (behavior).
Constructing an instance of a class provides the initial values for its state. Many languages (C++, Java, C#) use the same name for the class and the constructor.
Classes can be composed of other classes, creating a compositional relationship. A Person class could have a home address of type Address. Classes can also inherit from one another, so a Car class can inherit from a Vehicle class.
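Composition and inheritance can be sketched with the classes mentioned above. The fields and methods are invented for illustration. Note that Python names its constructor __init__ rather than reusing the class name.

```python
class Address:
    def __init__(self, street, city):
        self.street = street
        self.city = city


class Person:
    """Composition: a Person has an Address."""

    def __init__(self, name, home_address):
        self.name = name                  # state set by the constructor
        self.home_address = home_address

    def mailing_label(self):              # behavior that uses the state
        return f"{self.name}, {self.home_address.street}, {self.home_address.city}"


class Vehicle:
    def __init__(self, wheels):
        self.wheels = wheels

    def describe(self):
        return f"{type(self).__name__} with {self.wheels} wheels"


class Car(Vehicle):
    """Inheritance: a Car is a Vehicle."""

    def __init__(self):
        super().__init__(wheels=4)


resident = Person("Jane Doe", Address("1 Example St", "Springfield"))
label = resident.mailing_label()
car = Car()
description = car.describe()              # describe() is inherited from Vehicle
```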
IoTease: Project
Book Case for Raspberry Pi
Not exactly IoT but this is a fun project for housing your Raspberry Pi. They built a case to look like a small book. I really like the steampunk feel of the book and how they added some gears and the pi to the front. Check it out, the project itself is kinda lacking on specific details but has great images of the process of building the case.
Tricks of the Trade
Yes, stuff changes. Maintain calm about things that MIGHT happen and deal with the things that do. That doesn’t mean that you shouldn’t advocate for positions that you espouse, only that you should be a little more cautious about how you present them.
Editor’s Notes:
Please forgive the stuffiness and occasional sniffles as Will was recovering from a mild head cold.