
Data Structure describes how data is to be stored and organised in the computer’s memory so that it can be used effeciently. It’s the implementation part of the TV interface, which was discussed in the earlier article about ADTs, which nobody thinks about. Data structures convey meaning about the relationships between various pieces of data. It makes sense of the data being presented.
Whether you’re a programmer or not, you’re surrounded by structured data. Lists, dictionaries, and catalogs all help us organize information and take into account the relationships between various pieces of data.
At a certain level of detail, it starts to become necessary to formally describe the properties of the data we work with, giving rise to concrete specifications called data structures.
In programming, data structures are everywhere. The way you organize data into variables in order to use them in programs involves anywhere from dozens to millions of data structures. If you’re a developer, you’re probably familiar with common data structures like arrays, objects, graphs, etc. These structures are often linked together: for example, in a common data structure known as a linked list, each item indicates where to find the next item in a computer’s memory.
Wikipedia precisely describes a data structure as a collection of data values, the relationships among them, and the functions or operations that can be applied to the data. They provide a way of implementing ADTs in data-handling modules. Data Structures are the main part of the computer science algorithm. Selecting an ADT implementation that best uses data structure is important for creating an efficient computer program.
Data Structures are classified as below.
Primitive Data Type
- Integers
- Floats
- strings
- Booleans
Non-Primitive Data Type
A)Linear
- Static
- Arrays
2.Dynamic
- i.Linked List
- ii.Stack
- iii.Queue
B)Non-Linear
- Tree
- Graphs
Hash Table
1.Primitive Data Type
Primitive data types are pre-defined data types inside the given programming languages. Integers, Floats, Strings and Booleans are few examples of primitive data types.
•Integers: Zero, positive or negative numbers without fractional parts are integers. eg. 10, -20, 0 etc. Operations performed on integers are addition , subtraction, multiplication and division.
•Floats: Zero, positive or negative numbers with fractional parts are Floats. eg. 1.009, -20.00, 0.1999 etc. Operations performed on Floats are addition , subtraction, multiplication and division.
•Strings: Strings are text. eg. ‘Gaurav’, ‘myraah’ etc. Operations performed on strings are Upper case, Lower case etc.
•Booleans: Booleans represent one of the two values: True or False. They are mainly used when checking conditions. eg. ’10 is not equal to 10′ is False. Operations are XOR, OR, >, <, = etc.
2.Non-Primitive Data Type
Non-primitive data types or object oriented data types are user defined data types that store different data types in a single entity. As an example consider list A=[1, ‘Apple’, 2.21, False].
Here A contains integer, string, Float and Boolean inside it. hence A is a non-primitive data type. Now non- primitive data types are further divided into linear and non-linear data types.
Linear Data Type: Linear data type follows arrangement of elements in a sequential manner. Each element in linear data type is linked to it’s previous and next element. Since the elements are arranged linearly, it becomes easy to access the elements in a single run. The time complexity of structure increases with increase in size of data structure. Linear data types consist of Arrays, Linked lists, Stacks and Queues. Arrays are static in size i.e. once the size is fixed it cannot be increased further in the program. Linked lists, Stacks and Queues are dynamic in size i.e. size can be increased by adding new elements in the program. Let’s discuss these data structures one by one.
•Arrays:Arrays are static in size. Arrays are collections of similar types of data. There can be an array of integers, floats or strings. Arrays consist of alloting sequence of space in computer memory and then filling elements in that empty space. The sequence is ended with a special Null token. The element in the array can be accessed using index. One can implement stacks, lists and queues using arrays. Removing an element from the middle of the array might cause problem since then, you’ll have to shift each subsequent elements one step back.
•Linked Lists: Linked lists are dynamic in size. The need for linked lists arises because of the fixed size of arrays. If you create an array having a maximum number of 5 cells, then you are only allowed to add 5 elements in the array. You cannot add more than that. If you create an array having 50 cells but you are only able to fill only 20 of them then the memory corresponding to those 30 cells gets wasted. To overcome these difficulties, Linked lists were introduced. In Linked lists, elements are stored in a chain of cells that don’t need to be at a sequential memory address. Memory for each cell is allocated as needed. The address to the next cell is indicated using a pointer. The cell with empty pointer marks the end of the chain. The only disadvantage is that it’s very hard to retrieve nth item in linked list as you have to start searching from the 1st cell, use it to get the address of the 2nd cell, then use 2nd cell to get the address of the 3rd cell and so on until we finally get to the nth cell. Also it is difficult to backtrace the cell if we have the address of just one cell. Though backtracking is possible in doubly linked lists, it consumes extra memory to store the back pointer. The doubly linked list is the linked list with an extra cell having two pointers: one to the cell before and another to the cell after it. Along with a doubly linked list, there is a circular linked list and doubly circular linked list. In circular linked list pointer on last cell gives address of 1st cell whereas in doubly circular linked list pointer on 1st cell gives address of last cell and pointer on last cell gives address of 1st cell.
Non-Linear Data Type: In this data type elements are not arranged in a sequential manner but rather the elements are connected to more than two elements which causes difficulty in accessing the data.
More time is required to access the data. Trees and Graphs come under non-linear data structure. Let’s go through these data structures one by one.
•Tree:
Similar to linked lists, trees also employ memory cells that do not need to be contiguous in physical memory to store objects. Tree also has pointers which give the address of the next cell. The only difference is that in the link list, cells are arranged sequentially but trees have a structure similar to an actual physical tree. cells in trees are called nodes, pointers are called edges. Top most node is called root or parent node. Siblings are two nodes that have the same parent. Nodes with no children are called leaf nodes. A Path is the gap between two nodes and edges that can lead from one node to another. Level of the node represents the generation of the node. A set of trees is called a forest. One special type of tree is the binary search tree. Let x represent the parent node and y, z represent the children nodes. If y< x and z > x then y will be on the left and z will be on the right of parent node x. If we have to search for the node with a given key value, then this property of the tree makes the job easy. The complexity of searching an item is proportional to the height of the binary tree. This complexity is removed by doing Tree Balancing. As an example If we go on inserting nodes with value greater than the previous node then, we will end up getting a structure similar to a linked tree. This can be rearranged in order to reduce the height. This is called tree balancing. Some self-balancing trees are red-black Tree, Adelson-Velskii and Landis (AVL) Tree and B-Tree. These data structures are mainly used in maps and sets.
•Graphs:
Graphs are similar to trees except they don’t have parent or children nodes. Data can be arranged freely as nodes, which are known as vertices and edges and any node can have multiple incoming and outgoing edges.
This makes graphs more flexible. Let’s elaborate more on the structure of a graph.
- Two nodes connected by a ray are called directed edges, where the arrow shows the address of the next node.
- Two nodes connected by segments are called undirected edges.
- Graphs using directed edges are called directed graphs.
- Graphs using undirected edges are called undirected graphs.
As an example, graphs are used in highway Network, Flight Network, Internet, Local area network etc.
3.The Hash Tables
Hash tables are often used as a linear data structure but they can also be used as a non-linear data structure. Hash tables are similar to arrays but unlike arrays, elements are not stored in an ordered sequence. The position an element occupies is given by a hash function. That special function takes data you want to store as input, and outputs a random looking number. This number is interpreted as a memory location. This allows us to access the data instantly. As a disadvantage, sometimes the hash functions return the same memory position for two different inputs. This is called hash collision. Ideal hash function returns random looking values for different inputs. Hash tables are often used to implement sets and maps.
Do follow our LinkedIn page for updates: [ Myraah IO on LinkedIn ]