Build A CLI in OCaml with the Cmdliner Library

A Project-Based Tutorial for the Cmdliner Library in OCaml

In this tutorial we will build a simple CLI named textsearch that searches a given text file for a provided string (typically a word).

Prior knowledge of the components of a dune project is recommended but not required. I will guide you through setting up the dune project.

The entire source code of this demo project is available at - Release v0.2.1 · Debajyati/textsearch

This tutorial assumes you have dune preinstalled. If not, then install it via opam -

opam install dune

Summary of the Content Prior to Reading the Article

In this tutorial, you'll learn to build a simple command-line interface (CLI) tool called "textsearch" using OCaml and the Cmdliner library. The tool searches for a specified term in a text file with options for case sensitivity and character order. You'll be guided through setting up a Dune project, configuring the necessary files, writing the search logic, and defining command-line arguments using Cmdliner. The tutorial covers the project structure, dependencies, and detailed explanations of the code components to help you understand and implement the CLI tool effectively.

What is Cmdliner?

Cmdliner is a library that helps you build command line interface programs in OCaml.

It provides a simple and compositional mechanism to create commands and arguments in a declarative syntax.

So, let’s get started!

Create a Dune Project

Let’s start writing code. But before that we need to create a dune project.

Initialize the dune project with -

dune init proj textsearch

We will need the Cmdliner library installed on our system to build and run our project. So, install it via opam.

opam install cmdliner

Great! Now, before you write the actual code, you need to set up the dune project by writing the configuration files properly.

Setting Up the Dune Project

At the end of this tutorial, the file tree of our project will look like this -

[Image: file tree of textsearch, with .gitignore respected]

Here the textsearch.opam file is not your concern, so don’t touch it. Dune generates this file, so any changes belong in dune-project instead.
The dune-project file defines project-wide settings.

For now, you don’t need to edit most of the dune-project file.
Right after initialization, the file should look like this -

(lang dune 3.17)

(name textsearch)

(generate_opam_files true)

(source
 (github username/reponame))

(authors "Author Name <author@example.com>")

(maintainers "Maintainer Name <maintainer@example.com>")

(license LICENSE)

(documentation https://url/to/documentation)

(package
 (name textsearch)
 (synopsis "A short synopsis")
 (description "A longer description")
 (depends ocaml)
 (tags
  ("add topics" "to describe" your project)))

; See the complete stanza docs at https://dune.readthedocs.io/en/stable/reference/dune-project/index.html

Now change the package stanza to this -

(package
 (name textsearch)
 (synopsis "A simple text search CLI tool")
 (description "CLI tool to search terms in files")
 (depends
  (ocaml (>= 5.2))
  (cmdliner (>= 1.3.0))
  (dune (>= 3.17)))
 (tags
  ("cmdline" "text" "search")))

Nice!
We have two important directories here -

  • lib/ - Holds the source code for libraries within your project.

  • bin/ - Contains the source code for executables.

Forget the test folder for now as we are not going to write test cases for our project (At least not in this tutorial).

We also have a dune file in each of these folders. Each one defines how to build the library or executable in its directory.
Specifically, they are for -

  • Specifying dependencies (both internal and external libraries).

  • Specifying rules for building the target.

Paste the following code into the bin/dune file -

(executable
 (public_name textsearch)
 (name main)
 (libraries search cmdliner))

Paste the code below into the lib/dune file -

(library
 (name search)
 (public_name textsearch))

All good! Finally, the configuration work is DONE! HUH!

We can finally write the actual code.

Writing The Actual Code

Our Internal library

We need an internal library whose main module lives in search.ml, matching the name search we specified in the dune files. Create that file in the lib/ directory.

touch lib/search.ml

Now, paste this code in the file -

let search_in_file ~case_sensitive ~consider_order term file =
  (* Lowercase both sides when the search is case-insensitive. *)
  let normalize s =
    if case_sensitive then s else String.lowercase_ascii s
  in
  let term = normalize term in
  let term_len = String.length term in
  let contains_term line =
    let line = normalize line in
    if consider_order then
      (* Ordered search: term must occur as a contiguous substring. *)
      let line_len = String.length line in
      let rec check_substring i =
        if i + term_len > line_len then false
        else if String.sub line i term_len = term then true
        else check_substring (i + 1)
      in
      check_substring 0
    else
      (* Unordered search: every character of term occurs somewhere in line. *)
      String.for_all (fun c -> String.contains line c) term
  in

  let ic = open_in file in
  (* Matching on the exception (instead of wrapping the recursive call in
     try ... with) keeps process_lines tail-recursive on large files. *)
  let rec process_lines line_num acc =
    match input_line ic with
    | line ->
      let new_acc = if contains_term line then (line_num, line) :: acc else acc in
      process_lines (line_num + 1) new_acc
    | exception End_of_file ->
      close_in ic;
      List.rev acc
  in
  process_lines 1 []

(* Print each match as "line_num: line". *)
let print_matches matches =
  List.iter
    (fun (line_num, line) -> Printf.printf "%d: %s\n" line_num line)
    matches

We have two functions in this file.

  • search_in_file - This function takes four arguments:

    • case_sensitive: A boolean value indicating whether the search should be case sensitive or not.

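    • consider_order: A boolean value indicating whether the order of characters in the term matters during the search. When it is true, the term must appear in the line as a contiguous substring. When it is false, a line matches as soon as every character of the term occurs somewhere in it, in any order.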

    • term: The string to search for in the file.

    • file: The path to the file to search in.

      The function opens the file for reading and then walks through it line by line using a recursive helper. At each step, it reads a line and checks whether the line contains the term. If it does, the line number and the line itself are added as a tuple to an accumulator list. The helper then calls itself with the next line number and the updated accumulator. When it reaches the end of the file (the End_of_file exception), it closes the file and reverses the accumulated list so that the matches come out in their original order.

  • print_matches - This function takes a list of matches as input where each match is a tuple containing the line number and the line itself. It iterates over the list of matches and prints each match in the format "line_num: line".
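
To make the difference between the two search modes concrete, here is a small standalone sketch of the matching rule. The function name contains_term is borrowed from the helper inside search_in_file, but this top-level version with labelled arguments is only an illustration and not part of the project. It uses only the standard library (String.for_all needs OCaml 4.13 or newer).

```ocaml
(* Standalone copy of the matching rule, so it can be tried without a file. *)
let contains_term ~case_sensitive ~consider_order term line =
  let normalize s = if case_sensitive then s else String.lowercase_ascii s in
  let term = normalize term and line = normalize line in
  if consider_order then begin
    (* Ordered: term must occur as a contiguous substring of line. *)
    let term_len = String.length term and line_len = String.length line in
    let rec check i =
      i + term_len <= line_len
      && (String.sub line i term_len = term || check (i + 1))
    in
    check 0
  end else
    (* Unordered: each character of term must occur somewhere in line. *)
    String.for_all (fun c -> String.contains line c) term

let () =
  (* "cat" is a substring of "concatenate", "tac" is not. *)
  assert (contains_term ~case_sensitive:true ~consider_order:true "cat" "concatenate");
  assert (not (contains_term ~case_sensitive:true ~consider_order:true "tac" "concatenate"));
  (* Without order, "tac" matches because t, a and c all occur in the line. *)
  assert (contains_term ~case_sensitive:true ~consider_order:false "tac" "concatenate");
  (* Case only matters when case_sensitive is true. *)
  assert (contains_term ~case_sensitive:false ~consider_order:true "CAT" "concatenate");
  assert (not (contains_term ~case_sensitive:true ~consider_order:true "CAT" "concatenate"))
```

One consequence worth knowing: the unordered mode is quite permissive, since it ignores both position and how many times a character appears.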

I hope this explanation is clear enough to move ahead.
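
Before moving on, here is an optional variation on the file-reading part, in case you want the channel to be closed even when something goes wrong in the middle of the loop. It is a sketch, not the code used in the project: the name matching_lines and the ~keep predicate are made up for this example, and it assumes OCaml 4.14 or newer for the In_channel and Out_channel modules.

```ocaml
(* Collect (line_number, line) pairs for the lines that satisfy keep.
   In_channel.with_open_text closes the channel when the callback returns
   or raises, so no explicit close_in is needed. *)
let matching_lines ~keep file =
  In_channel.with_open_text file (fun ic ->
    let rec loop line_num acc =
      match In_channel.input_line ic with
      | None -> List.rev acc
      | Some line ->
        let acc = if keep line then (line_num, line) :: acc else acc in
        loop (line_num + 1) acc
    in
    loop 1 [])

let () =
  (* Write a small temporary file so the sketch runs on its own. *)
  let path = Filename.temp_file "textsearch" ".txt" in
  Out_channel.with_open_text path (fun oc ->
    output_string oc "alpha\nbeta\ngamma\n");
  let hits = matching_lines ~keep:(fun l -> String.contains l 'm') path in
  (* Only the third line contains an 'm', so this prints "3: gamma". *)
  List.iter (fun (n, l) -> Printf.printf "%d: %s\n" n l) hits;
  Sys.remove path
```

In_channel.input_line returns an option instead of raising End_of_file, which makes the end-of-file case an ordinary branch of the match.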

Our Executable Program

This is our bin/main.ml file. Here we open both our internal library Search and the external library Cmdliner, and then define the command and its arguments.

open Cmdliner
open Search

(* Define arguments *)
let search_term =
  let doc = "The term to search for" in
  Arg.(required & pos 0 (some string) None & info [] ~docv:"TERM" ~doc)

let file_name =
  let doc = "The file to search in" in
  Arg.(required & pos 1 (some string) None & info [] ~docv:"FILE" ~doc)

let case_sensitive =
  let doc = "Perform a case-sensitive search" in
  Arg.(value & flag & info ["c"; "case-sensitive"] ~doc)

let consider_order =
  let doc = "Consider the order of characters in the search term" in
  Arg.(value & flag & info ["o"; "consider-order"] ~doc)

(* Core command logic *)
let run case_sensitive consider_order term file =
  let matches = search_in_file ~case_sensitive ~consider_order term file in
  print_matches matches;
  if List.length matches = 0 then
    (Printf.eprintf "No matches found.\n"; exit 1)
  else
    exit 0

(* Create the command *)
let cmd =
  let info =
    Cmd.info "textsearch"
      ~version:"1.0.0"
      ~doc:"Search for a term in a file with options for case sensitivity and character order"
  in
  Cmd.v info
    Term.(const run $ case_sensitive $ consider_order $ search_term $ file_name)

(* Main entry point *)
let () = Stdlib.exit (Cmd.eval cmd)

Let’s break down the bin/main.ml file to understand its components and the utilities of the Cmdliner library used here.

After imports, we first define all the arguments as variables. The arguments for our program are defined using Cmdliner.Arg module.
In this program we defined 4 arguments.

  1. search_term (positional argument): - The term (word) to search for in the file.

  2. file_name (positional argument): - The file in which the term is to be searched.

  3. consider_order (flag): - Ensures that the order of characters in the search term is considered.

  4. case_sensitive (flag): - Enables case-sensitive search.

Explanation of the Cmdliner library utilities used here: -

Arg: This module provides functions for defining command-line arguments, including their types, descriptions, and help messages.

  • Arg.required: This function defines a required argument (makes the argument mandatory).

  • Arg.pos: This function specifies the position of an argument in the command-line syntax. So, for example, pos 0 will make the argument the first positional argument.

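  • Arg.info: This function provides a docstring and other metadata for an argument. Its first parameter is the list of names for a flag (such as ["c"; "case-sensitive"]), and it is left empty for positional arguments.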

  • Arg.value: This function defines the value of an argument. The documentation says, - (value a) is a term that evaluates to a's value.

  • Arg.flag: This function defines a flag argument that may appear at most once on the command line. The argument holds true if the flag is present on the command line and false otherwise.

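  • Arg.some: This function turns a converter for a type into a converter for the corresponding option type. That is why (some string) is paired with the default None: the argument is None when absent, and Arg.required then insists that it be present.

  • Arg.(&): This is a right-associative infix operator for function application: f & x means f x. So required & pos 0 (some string) None & info [] reads as required (pos 0 (some string) None (info [])), which lets us connect the individual pieces of a command-line argument (its type, default value, parsing logic and docs) in a declarative way.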
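
If you want to see these combinators in action without touching the textsearch code, the sketch below defines a tiny, hypothetical greet command (the names name, shout and greeting are invented for this example) and evaluates it against hand-written argument arrays. It relies on Cmd.eval_value and its optional ~argv parameter from Cmdliner, which return the computed value instead of running the program from Sys.argv.

```ocaml
open Cmdliner

(* Hypothetical mini-CLI: one required positional argument and one flag. *)
let name =
  Arg.(required & pos 0 (some string) None
       & info [] ~docv:"NAME" ~doc:"Who to greet")

let shout =
  Arg.(value & flag & info [ "s"; "shout" ] ~doc:"Upper-case the greeting")

let greeting name shout =
  let s = "hello " ^ name in
  if shout then String.uppercase_ascii s else s

let cmd = Cmd.v (Cmd.info "greet") Term.(const greeting $ name $ shout)

(* Evaluate against a hand-written argv instead of Sys.argv.
   argv.(0) plays the role of the program name. *)
let parse argv =
  match Cmd.eval_value ~argv cmd with
  | Ok (`Ok s) -> Some s
  | _ -> None

let () =
  assert (parse [| "greet"; "world" |] = Some "hello world");
  assert (parse [| "greet"; "--shout"; "world" |] = Some "HELLO WORLD");
  assert (parse [| "greet"; "-s"; "world" |] = Some "HELLO WORLD");
  (* Missing required positional argument: Cmdliner reports an error
     on stderr and the evaluation does not produce a value. *)
  assert (parse [| "greet" |] = None)
```

To build this with dune, cmdliner has to be listed in the libraries field, just like in our bin/dune file.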

After the arguments, we defined the core command logic. The whole command logic is encapsulated in the run function.

The run function calls the search_in_file function from the Search library and prints the matching lines using the print_matches function. It exits with code 1 if no matches are found, and with code 0 otherwise.

Next, we define the command in the cmd variable.

And finally, there is the main entry point of our program where it starts its execution from. That is -

let () = Stdlib.exit (Cmd.eval cmd)

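Here, Cmd.eval parses the command line and, if parsing succeeds, calls run with the parsed values. Since run ends the process itself with exit 0 or exit 1, the code returned by Cmd.eval only reaches Stdlib.exit when run is not the one finishing the program, for example when Cmdliner handles --help or reports a parsing error.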
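
In the tutorial code, run ends the process by calling exit. A possible alternative, shown here only as a sketch, is to let the function return the exit code and hand it to Cmd.eval', which uses the integer produced by the term as the program's exit code. The names run_code and count are stand-ins invented for this example, and the ~argv parameter is only used so that the sketch can be checked without a real command line.

```ocaml
open Cmdliner

(* Hypothetical stand-in for run: return the exit code instead of
   calling exit. Here the "number of matches" is passed in directly. *)
let run_code count = if count = 0 then 1 else 0

let count =
  Arg.(value & pos 0 int 0
       & info [] ~docv:"COUNT" ~doc:"Number of matches (stand-in)")

let cmd = Cmd.v (Cmd.info "demo") Term.(const run_code $ count)

let () =
  (* Cmd.eval' returns the int computed by the term. *)
  assert (Cmd.eval' ~argv:[| "demo"; "3" |] cmd = 0);
  assert (Cmd.eval' ~argv:[| "demo"; "0" |] cmd = 1);
  (* "x" is not an int, so parsing fails and Cmdliner's own
     command-line error code is returned (a message goes to stderr). *)
  assert (Cmd.eval' ~argv:[| "demo"; "x" |] cmd = Cmd.Exit.cli_error)
```

With this design the entry point would become let () = exit (Cmd.eval' cmd), and the function stays easy to call from tests because it never terminates the process.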

Explanation of all the Cmdliner library utilities used here: -

Cmd: This module is the core of Cmdliner. It provides functions to define the commands, specify their options and arguments, and handle argument parsing, grouping and usage messages.

  • Cmd.info: This function creates a command description, including its name, version, and helptexts (a docstring explaining its purpose).

  • Cmd.v: This function creates a fully functional command combining the provided info and terms.

  • Cmd.eval: This function evaluates the given command, parses arguments, and executes the corresponding logic.

Term: This module deals with creating and combining terms. A term is a value that Cmdliner computes from the command line, and every argument we defined above is itself a term.

  • Term.const: This function creates a term that evaluates to a constant value. In the given code, it wraps the run function so that it can be applied to the argument terms.

  • Term.($) - An infix operator that applies a function term to an argument term. Chaining it passes the parsed arguments to run one by one, in the order run expects them.
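
The way const and $ fit together is easier to see with the intermediate types written out. The sketch below uses a made-up describe function and two made-up arguments (verbose and path); the type annotations are there purely for illustration and the compiler would infer them anyway.

```ocaml
open Cmdliner

(* An ordinary function that knows nothing about the command line. *)
let describe verbose path =
  if verbose then "verbose:" ^ path else path

let verbose = Arg.(value & flag & info [ "v"; "verbose" ] ~doc:"Be verbose")

let path =
  Arg.(value & pos 0 string "." & info [] ~docv:"PATH" ~doc:"Path to describe")

(* const lifts the function into a term ... *)
let step0 : (bool -> string -> string) Term.t = Term.const describe

(* ... and each $ feeds it one argument term, peeling off one arrow. *)
let step1 : (string -> string) Term.t = Term.(step0 $ verbose)
let step2 : string Term.t = Term.(step1 $ path)

(* Helper for this sketch: evaluate the final term against a given argv. *)
let eval argv =
  match Cmd.eval_value ~argv (Cmd.v (Cmd.info "describe") step2) with
  | Ok (`Ok s) -> s
  | _ -> failwith "unexpected evaluation result"

let () =
  assert (eval [| "describe" |] = ".");
  assert (eval [| "describe"; "-v"; "src" |] = "verbose:src")
```

This is also why the order of the terms after const run in our main.ml has to match the order of run's parameters.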

So, now you understand the parts of the Cmdliner library used in this project.

Now it’s time to build and run our project to see if this works or not.

Building & Running the Project

Without waiting, let's build and run it!

Building the Project

  • Open your terminal and navigate to the root directory of your project (where dune-project is located).

  • Run the following command to build the program:

      dune build
    

This command will use the dune files to compile the source code in lib/search.ml and bin/main.ml and create an executable named textsearch in the _build/install/default/bin/ directory.

Now, you can run the executable with this command -

dune exec -- textsearch -c -o <term> "path/to/file/to/search/in"

Here, <term> is the term you want to search for in the file.

Replace "path/to/file/to/search/in" with the actual path of the file you want to search the term in.

-c is the optional flag for case-sensitivity,

-o is the optional flag for respecting order of characters.

For example, I use vim and have the vim config file .vimrc located in the $HOME directory of my system.

I want to search for my name in the file to see if it’s there. I would run -

dune exec -- textsearch -c -o Debajyati ~/.vimrc

And the program would print the matching lines in the console.

[Image: output of the command I've run]

Tadaaa! And it works pretty well and smooth! Neat!

Congratulations! 🎉 You now have a functional CLI tool built with OCaml and Cmdliner!

Ending

I would really appreciate your feedback on this tutorial. How would you rate it? How can I improve it?

If you found this project or tutorial helpful, please consider starring the repository below. It will encourage me to create more content like this.

If you found this POST helpful, if this blog added some value to your time and energy, please show some love by giving the article some likes and share it with your developer friends.

Feel free to connect with me at - Twitter, LinkedIn or GitHub :)

Happy Coding 🧑🏽‍💻👩🏽‍💻! Have a nice day ahead! 🚀