Introduction
In the previous article we discussed how OCaml differs from other languages: its expression-oriented nature, its static type system and some other aspects.
In this article we will mainly discuss the data types of the language.
But before we head on to the main topic, let me share some insights about the tools we will be using.
Running & Compiling
Utop (Universal Toplevel for OCaml) is an interactive toplevel, similar to IPython for Python, with some different features.
Although you can install utop manually with the default package manager apt (assuming you are using Ubuntu; if you have a Windows computer, use WSL2 instead), I recommend that you follow every step described on the webpage (link below) to install opam and OCaml and to set up an opam switch (similar to a virtual environment in Python 3).
Click here to see the steps. It may seem time-consuming, but trust me, it's worth it.
Once you have installed and set up OCaml, opam and utop, you need to know how to run an existing OCaml file.
Run without compiling:
If your program uses no library other than the standard library, you can run it directly without compiling it first. To do that -
ocaml filename.ml #place the appropriate path to the ocaml file in place of `filename.ml`
Compiling and running:
The compiler we use here to compile OCaml programs is ocamlc. It first compiles the source file into bytecode, and then we run the resulting bytecode executable manually.
ocamlc -o filename.byte filename.ml
./filename.byte
If you look closely, you'll see that two more files are also generated: filename.cmo (the compiled bytecode object) and filename.cmi (the compiled interface). They are not needed to run the program, and we don't need them now. So clean them up using -
rm filename.cmi filename.cmo
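If you don't have a file to try these commands on yet, here is a minimal sketch you could save as hello.ml (the file name and the `greeting` function are made up for this illustration). It uses only the standard library, so both ways of running it described above should work.

```ocaml
(* hello.ml - a hypothetical file name used only for this sketch *)

(* Build a greeting string; `^` concatenates two strings *)
let greeting name = "Hello, " ^ name ^ "!"

(* `let () = ...` is a common way to run code only for its side effects *)
let () = print_endline (greeting "OCaml")
```

Running it with `ocaml hello.ml`, or compiling it with ocamlc and running the bytecode file, prints the greeting once.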
Heading Over to the Main Topic
NOTE: We can write comments (including multiline ones) in OCaml inside (* *). e.g. -
(* This is a comment *)
Enter utop in the terminal and get ready to write code.
You should see a similar looking interface after running utop (img above).
In utop, every expression you enter must end with ;; (double semicolon). Whenever you evaluate an expression in utop, it will show the resulting value and type on the next line or the next few lines.
Primitive Expression Types
The primitive types are unit, int, char, float, bool, and string.
Unit: Singleton Type
The unit type is the simplest type in OCaml. It contains exactly one value: ( ). Seems stupid, right? Actually not!
In an expression-oriented language, every expression must return a value. Then what about those expressions which only perform side effects?
( ) is used as the value of a procedure that performs a side effect. It is similar to the void type in C.
print_endline "Let's Learn OCaml";;
(* This expression prints the specified string to the screen.
Printing something to screen is seen as a side-effect.
So, this expression will return a unit. *)
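To make the idea a bit more concrete, here is a small sketch (the names `greet` and `result` are invented for this example) showing that a side-effecting function has unit as its return type, and that its result really is the value ().

```ocaml
(* A function called only for its side effect returns unit *)
let greet (name : string) : unit = print_endline ("Hello, " ^ name)

(* The result of the call is the single unit value () *)
let result : unit = greet "OCaml"

(* Side-effecting expressions can be sequenced with `;` *)
let () =
  print_string "one, ";
  print_endline "two"
```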
Int: Integers
This is the type of signed integers. All positive integers (1, 2, 3, 4, ...), all negative integers (..., -4, -3, -2, -1) and 0 are recognised as integers.
OCaml integers range from -
$$-2^{62} \text{ to } 2^{62} - 1$$
on 64-bit systems.
let num = 5;; (* integer expression *)
val num : int = 5 (* utop output *)
int described in binary - starts with 0b
int described in octal - starts with 0o
int described in hexadecimal - starts with 0x
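A quick sketch of these literal prefixes (the variable names are only for illustration). The standard library values `min_int` and `max_int` hold the actual limits on your machine, so printing them is an easy way to check the range mentioned above.

```ocaml
let from_binary = 0b1010   (* binary literal, equal to 10 *)
let from_octal = 0o17      (* octal literal, equal to 15 *)
let from_hex = 0xFF        (* hexadecimal literal, equal to 255 *)
let readable = 1_000_000   (* underscores may be used to group digits *)

let () =
  Printf.printf "%d %d %d %d\n" from_binary from_octal from_hex readable;
  (* The exact values printed here depend on the platform *)
  Printf.printf "min_int = %d, max_int = %d\n" min_int max_int
```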
Float: Floating-Point Numbers
The syntax of a floating-point literal requires a decimal point, an exponent (base 10) denoted by 'E' or 'e', or both. A digit is required before the decimal point, but not after. Let's look at some examples -
31.415926E-1;; (* float value *)
- : float = 3.1415926 (* utop output *)
let number = 2e7;; (* float expression *)
val number : float = 20000000. (* utop output *)
(* float expression with unnecessary type annotation*)
let floating:float = 0.01;;
val floating : float = 0.01 (* utop output *)
Char: Characters
The type char represents a single 8-bit character, which covers the whole ASCII character set. The syntax for a character constant uses the single quote symbol. e.g. - 'a', 'x', 'F', ' ' etc.
But there's more to know! Escape sequences, though commonly associated with strings, can also be written as char values.
Must Know Escape Sequences:
Sequences | Definition |
'\\' | The backslash character |
'\'' | The single-quote character |
'\t' | The tab character |
'\r' | The carriage-return character |
'\n' | The newline character |
'\ddd' | The decimal escape sequence |
A decimal escape sequence must have exactly three decimal digits. It specifies the character with the given decimal code (for codes up to 127, that is the ASCII character).
Let's see some examples -
let ch = 'x';; (* char expression *)
val ch : char = 'x' (* utop output *)
'\123';; (* decimal escape sequence value *)
- : char = '{' (* utop output *)
'\121';; (* decimal escape sequence value *)
- : char = 'y' (* utop output *)
String: Character Strings
In OCaml, strings are a primitive type, written as character sequences delimited by double quotes. Unlike C, OCaml strings are not arrays of characters and do not employ the null character '\000' for termination. Strings in OCaml support escape sequences for specifying special characters, akin to those used for individual characters.
let str = "Hello\n World!";; (* string expression *)
val str : string = "Hello\n World!" (* utop output *)
(* The absolute nightmare way to write a hello world program *)
let greet = "\072\101\108\108\111\044 \087\111\114\108\100\033";;
val greet : string = "Hello, World!"
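Here is a short sketch of a few everyday string operations from the standard library (the variable names are just for this example). It also shows that the null character is an ordinary character inside a string and does not end it.

```ocaml
let str = "Hello"

let len = String.length str     (* number of characters: 5 *)
let first = str.[0]             (* indexing with .[i] gives a char: 'H' *)
let joined = str ^ ", World!"   (* `^` concatenates two strings *)

(* '\000' does not terminate the string, so the length is still 3 *)
let with_nul = "a\000b"

let () =
  Printf.printf "%d %c %s %d\n" len first joined (String.length with_nul)
```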
Bool: Boolean Values
The bool type includes true and false, and logical negation is done via the not function. Comparison operations (=, ==, !=, <>, <, <=, >=, >) return true if the relation holds; == is used for checking physical equality, while = checks structural equality.
Boolean Expression | What does it signify |
x = y | x is equal to y |
x <> y | x is not equal to y |
x == y | x is "identical" to y |
x != y | x is not "identical" to y |
x > y | x is strictly greater than y |
x >= y | x is greater than or equal to y |
x < y | x is strictly less than y |
x <= y | x is less than or equal to y |
If you're experienced in Python, Java or C++, you will have to get used to writing = in conditions instead of ==.
5.1 = 5.1;; (* boolean expression checking structural equality *)
- : bool = true (* utop output *)
5.1 != 5.1;; (* boolean expression checking physical inequality *)
- : bool = true (* utop output *)
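The difference between the two kinds of equality is easiest to see with mutable values. The sketch below uses two reference cells (`a` and `b` are names chosen for this example, and `ref` is not covered in this article): they hold the same content, but they are two separate objects in memory.

```ocaml
let a = ref 1
let b = ref 1

let structural = (a = b)    (* true: both cells contain 1 *)
let physical = (a == b)     (* false: they are two distinct cells *)
let same_cell = (a == a)    (* true: a value is always identical to itself *)

(* `not`, `&&` and `||` combine boolean values *)
let combined = structural && not physical

let () = Printf.printf "%b %b %b %b\n" structural physical same_cell combined
```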
Type Conversion
OCaml provides functions to convert values of one primitive type to another.
From _ to int :
→ use - int_of_string
int_of_string "145";;
- : int = 145
→ use - int_of_char
int_of_char 'o';;
- : int = 111
→ use - int_of_float
int_of_float 1.9999999;; (* truncates the float toward zero *)
- : int = 1
→ use - Char.code
Char.code 'd';; (* Char is a module which has a function named
`code` to do this *)
- : int = 100
From _ to float :
→ use - float_of_int
float_of_int 52;;
- : float = 52.
→ use - float_of_string
float_of_string "5";;
- : float = 5.
float_of_string "0.5";;
- : float = 0.5
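One reason these conversions matter: OCaml never converts between int and float on its own, and floats have their own arithmetic operators (`+.`, `-.`, `*.`, `/.`). The following is a small sketch with made-up values showing an explicit conversion in both directions.

```ocaml
let count = 3      (* an int *)
let total = 7.5    (* a float *)

(* `total /. count` would be a type error, so convert the int first *)
let average = total /. float_of_int count

(* int_of_float drops the fractional part, for negative numbers too *)
let truncated = int_of_float average
let negative = int_of_float (-1.9)

let () = Printf.printf "%.1f %d %d\n" average truncated negative
```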
From _ to char:
→ use - char_of_int
char_of_int 55;;
- : char = '7'
char_of_int 97;;
- : char = 'a'
char_of_int 67;;
- : char = 'C'
→ use - Char.chr
Char.chr 45;;
- : char = '-'
Char.chr 105;;
- : char = 'i'
From _ to string:
→ use - string_of_int
string_of_int 746;;
- : string = "746"
→ use - string_of_bool
string_of_bool true;;
- : string = "true"
→ use - string_of_float
string_of_float 45.0;;
- : string = "45."
From _ to bool:
→ use - bool_of_string
let wrong = bool_of_string "false";;
val wrong : bool = false
(* `bool_of_string` only works if the provided string is "false" or "true" *)
bool_of_string "";;
Exception: Invalid_argument "bool_of_string". (* throwing an exception/error *)
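If you would rather not deal with exceptions, the standard library (OCaml 4.05 and later) also has `_opt` variants such as `int_of_string_opt` and `bool_of_string_opt`, which return an option instead of raising. A minimal sketch, where `parse_age` is a hypothetical helper written only for this example:

```ocaml
(* Turn user input into a message without risking an exception *)
let parse_age (s : string) : string =
  match int_of_string_opt s with
  | Some n -> "age is " ^ string_of_int n
  | None -> "not a number: " ^ s

(* The empty string is neither "true" nor "false", so this is None *)
let no_bool = bool_of_string_opt ""

let () =
  print_endline (parse_age "42");
  print_endline (parse_age "abc")
```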
Custom Types
We can define custom data types using a type definition with the type keyword. Types defined as a list of alternatives, like the one below, are called variants.
Example -
(* Defining a type representing different days of the week *)
type day =
  | Monday
  | Tuesday
  | Wednesday
  | Thursday
  | Friday
  | Saturday
  | Sunday
;;
(* `|` is a symbol in OCaml that separates different patterns or cases.
It is mainly used in type definitions and pattern matching code. *)
(* utop output *)
type day = Monday | Tuesday | Wednesday | Thursday | Friday | Saturday | Sunday
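Since `|` shows up in pattern matching as well, here is a small sketch of how a value of this type might be inspected. The function `is_weekend` is an invented example, not something from the standard library.

```ocaml
type day =
  | Monday | Tuesday | Wednesday | Thursday
  | Friday | Saturday | Sunday

(* Two cases can share one branch; `_` matches every remaining day *)
let is_weekend (d : day) : bool =
  match d with
  | Saturday | Sunday -> true
  | _ -> false

let () = Printf.printf "%b %b\n" (is_weekend Sunday) (is_weekend Monday)
```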
Composite Data Types
Lists
Lists are homogeneous collections written between square brackets, with the elements separated by semicolons. They are immutable and support powerful pattern matching operations, making them essential in functional programming.
(* Defining a list of integers *)
let numbers = [1; 2; 3; 4; 5];;
val numbers : int list = [1; 2; 3; 4; 5]
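A brief sketch of what immutability and pattern matching look like in practice (the function `sum` is written by hand here just for illustration). Note that adding an element builds a new list and leaves the original one untouched.

```ocaml
let numbers = [1; 2; 3; 4; 5]

(* `::` puts an element in front, producing a new list *)
let extended = 0 :: numbers

(* A list is either empty or a head element followed by the rest *)
let rec sum = function
  | [] -> 0
  | x :: rest -> x + sum rest

(* List.map applies a function to every element *)
let doubled = List.map (fun x -> x * 2) numbers

let () = Printf.printf "%d %d\n" (sum numbers) (List.length extended)
```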
Arrays
Arrays in OCaml, whose type is written with the array type constructor, are fixed-size, mutable collections of elements of the same data type. They are written between [| and |], are zero-indexed, and an element is accessed with the arr.(i) syntax.
let numbers = [|1; 2; 3|];; (* defining an array of integers *)
val numbers : int array = [|1; 2; 3|] (* utop output *)
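A minimal sketch of reading and updating array elements, reusing the `numbers` array from above. Unlike lists, the update happens in place.

```ocaml
let numbers = [|1; 2; 3|]

let first = numbers.(0)        (* read the element at index 0 *)

let () = numbers.(1) <- 20     (* overwrite the element at index 1 in place *)

let size = Array.length numbers

let () = Printf.printf "%d %d %d\n" first numbers.(1) size
```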
Tuples
Tuples are ordered, fixed-size collections of elements that may have different types. They offer a convenient way to group heterogeneous data. Commas separate the tuple's components from one another, and the whole tuple is usually enclosed in parentheses ().
(* Defining a tuple *)
let credentials = ("Debajyati", 6);;
val credentials : string * int = ("Debajyati", 6) (* utop output *)
(* matching the pattern to access individual elements *)
let (name, roll) = credentials;;
val name : string = "Debajyati" (* utop output *)
val roll : int = 6
(* printing a string with those values *)
Printf.printf "Roll no. of %s is %d\n" name roll;;
Roll no. of Debajyati is 6 (* utop output *)
- : unit = ()
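For pairs specifically, the standard library functions `fst` and `snd` are an alternative to pattern matching. Tuples are also a handy way to return more than one value from a function, as in this sketch (`min_max` is a made-up helper for the example).

```ocaml
let credentials = ("Debajyati", 6)

let name = fst credentials   (* first component of a pair *)
let roll = snd credentials   (* second component of a pair *)

(* Return both values at once, smaller one first *)
let min_max a b = if a <= b then (a, b) else (b, a)

let (lo, hi) = min_max 9 4

let () = Printf.printf "%s %d %d %d\n" name roll lo hi
```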
Records
Records are labeled collections of fields, akin to structs in other languages. They allow for structured data representation and manipulation.
(* Defining a record representing a person *)
type person = {
  name : string;
  age : int;
};;
type person = { name : string; age : int; } (* utop output *)
(* Creating a person record *)
let john = { name = "John"; age = 30 };;
val john : person = {name = "John"; age = 30} (* utop output *)
(* Accessing fields of the record *)
let () = Printf.printf "%s is %d years old\n" john.name john.age;;
John is 30 years old (* utop output *)
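Record fields are immutable by default, so "changing" a field usually means building a new record. A short sketch of that, using the `{ ... with ... }` syntax (`older_john` and `describe` are names invented for this example):

```ocaml
type person = {
  name : string;
  age : int;
}

let john = { name = "John"; age = 30 }

(* Copy john, replacing only the age field; john itself is unchanged *)
let older_john = { john with age = john.age + 1 }

(* Fields can be pulled out directly in the function parameter *)
let describe { name; age } = Printf.sprintf "%s is %d years old" name age

let () = print_endline (describe older_john)
```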
Algebraic Data Types (ADTs)
Algebraic data types in OCaml are a way of defining composite types by combining simpler types using constructors. Here we look at two forms of them: variant types and recursive types.
Variant Types
Variant types enable the creation of sum types, where a value can be one of several possibilities. They are particularly useful for modeling complex data structures and handling multiple cases in pattern matching. We already saw an example of Variant Types in OCaml in the Custom types section of this blog. Let's see another example -
(* Defining a variant type representing shapes *)
type shape =
  | Circle of float
  | Rectangle of float * float;;
type shape = Circle of float | Rectangle of float * float (* utop output *)
(* Creating instances of shapes *)
let circle = Circle 5.0;;
val circle : shape = Circle 5. (* utop output *)
let rectangle = Rectangle (3.0, 4.0);;
val rectangle : shape = Rectangle (3., 4.) (* utop output *)
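To show how such values are handled in pattern matching, here is a sketch of an `area` function (my own example, not part of the original code). Each branch unpacks the data carried by its constructor. It assumes OCaml 4.07 or later for `Float.pi`.

```ocaml
type shape =
  | Circle of float
  | Rectangle of float * float

(* One branch per constructor; the compiler warns if a case is missing *)
let area = function
  | Circle r -> Float.pi *. r *. r
  | Rectangle (w, h) -> w *. h

let () =
  Printf.printf "%.2f %.2f\n" (area (Circle 5.0)) (area (Rectangle (3.0, 4.0)))
```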
Recursive Types
Recursive variant types allow for the definition of recursive data structures, such as linked lists and binary trees. One basic example using linked lists -
(* Defining a recursive list type *)
type 'a mylist =
  | Empty
  | Cons of 'a * 'a mylist;;
type 'a mylist = Empty | Cons of 'a * 'a mylist (* utop output *)
(* Creating a list of integers *)
let int_list = Cons (1, Cons (2, Cons (3, Empty)));;
val int_list : int mylist = Cons (1, Cons (2, Cons (3, Empty))) (*utop output*)
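Functions over a recursive type are usually recursive too, with one case per constructor. As a sketch, here are two hypothetical helpers for the mylist type defined above: one counts the elements and one converts to a built-in list.

```ocaml
type 'a mylist =
  | Empty
  | Cons of 'a * 'a mylist

let int_list = Cons (1, Cons (2, Cons (3, Empty)))

(* The recursion follows the shape of the type: stop at Empty, recurse on Cons *)
let rec length = function
  | Empty -> 0
  | Cons (_, rest) -> 1 + length rest

(* Convert to OCaml's built-in list type *)
let rec to_list = function
  | Empty -> []
  | Cons (x, rest) -> x :: to_list rest

let () = Printf.printf "%d\n" (length int_list)
```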
'a represents a type variable, indicating that the elements of the list can be of any type. It's a placeholder for a concrete type that is fixed when the list is instantiated. Empty represents the end of a list, indicating that there are no more elements left. Cons represents adding an element to the front of a list, combining the new element with the rest of the list.
Option Types
The option data type, written with the option type constructor, is used to represent values that may or may not be present. OCaml has no null value, so option is the usual way to handle a missing or undefined result: a value is either Some x or None.
Example:
let maybe_number: int option = Some 42;;
val maybe_number : int option = Some 42 (* utop output *)
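A typical use is a function that cannot always produce a result. The sketch below uses a made-up `safe_div` function that returns None instead of failing on division by zero. `Option.value` assumes OCaml 4.08 or later.

```ocaml
(* Integer division that reports failure through the return type *)
let safe_div (a : int) (b : int) : int option =
  if b = 0 then None else Some (a / b)

(* The caller has to handle both cases explicitly *)
let describe = function
  | Some q -> "quotient: " ^ string_of_int q
  | None -> "division by zero"

(* Option.value supplies a fallback when the result is None *)
let fallback = Option.value (safe_div 1 0) ~default:0

let () =
  print_endline (describe (safe_div 10 2));
  print_endline (describe (safe_div 1 0))
```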
Module Types
Wait, there exists such a thing as a Module Type?? Wow! 🫡
In OCaml, modules provide a way to encapsulate related code, data, and types. They serve as containers for organizing and structuring code, much like namespaces in other languages. Module types, then, define the interface or signature of a module, specifying the types and functions that must be implemented by any module that conforms to it.
What is the Use Case? Why Even Use Module Types?
Module types play a crucial role in enforcing abstraction and modularity in OCaml programs. By defining interfaces through module types, developers can separate the concerns of implementation details from the external interface.
Defining Module Types
To define a module type, we use the module type keywords followed by a name and a set of specifications. These specifications include the types and functions that the module must provide. For instance, consider a module type defining the interface for a stack data structure:
module type StackType = sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  val pop : 'a t -> 'a option * 'a t
end
;;
(* utop output - same actually 🥲 *)
module type StackType =
  sig
    type 'a t
    val empty : 'a t
    val push : 'a -> 'a t -> 'a t
    val pop : 'a t -> 'a option * 'a t
  end
Here, StackType is a module type specifying that any module implementing it must define a type 'a t representing a stack, as well as the value empty and the functions push and pop for stack manipulation.
Implementing a Module with the Module Type We Created Now
Once a module type is defined, we can create modules that adhere to it by providing concrete implementations for its specifications. For example, we can implement the StackType module type in a module named Stack as follows:
module Stack : StackType = struct
  type 'a t = 'a list (* Representing the abstract stack type as a list *)
  let empty = []
  let push x s = x :: s
  (* pattern matching used to define the function *)
  let pop = function
    | [] -> (None, [])
    | x :: xs -> (Some x, xs) (* `::` is the Cons operator *)
end
;;
module Stack : StackType (* utop output *)
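To see the module in action, here is a usage sketch. The definitions are repeated so the snippet runs on its own; the names `s`, `top`, `next` and so on are just for this example. Keep in mind that this Stack hides the standard library module of the same name for the rest of the file.

```ocaml
module type StackType = sig
  type 'a t
  val empty : 'a t
  val push : 'a -> 'a t -> 'a t
  val pop : 'a t -> 'a option * 'a t
end

module Stack : StackType = struct
  type 'a t = 'a list
  let empty = []
  let push x s = x :: s
  let pop = function
    | [] -> (None, [])
    | x :: xs -> (Some x, xs)
end

(* Push 1, then 2; the element pushed last comes out first *)
let s = Stack.(empty |> push 1 |> push 2)

let (top, rest) = Stack.pop s         (* top is Some 2 *)
let (next, rest') = Stack.pop rest    (* next is Some 1 *)
let (nothing, _) = Stack.pop rest'    (* popping an empty stack gives None *)
```

Because the signature keeps 'a t abstract, code outside the module cannot treat a stack as a plain list; it has to go through empty, push and pop. That is the separation of interface and implementation described above.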
So, finally we are done. Now you know all the most important and noteworthy data types in OCaml.
Conclusion
Mastering data types in OCaml is essential for writing maintainable & efficient code. From primitive types to algebraic data types and module types, OCaml has many tools for data manipulation & abstraction.
In future blog posts, we will explore advanced topics such as recursion, higher-order functions, and OCaml's module system. Stay tuned!
Until then you can connect with me on twitter :) & share this article with your friends!
Most importantly - Happy Coding! 🧑🏻‍💻 👩🏻‍💻