This webpage provides tutorials introducing remote computing and command-line tools commonly used in genomics and bioinformatics. The tutorials focus on practical workflows using Linux computers running the Ubuntu distribution.
The material presented here is especially relevant for:
These tutorials will help you develop the computational skills required to connect to remote computers, manage long analyses, and navigate the command-line environment used in genomics research.
After completing these tutorials, you will be able to:
Connect to remote Linux computers using the ssh protocol.
Run and manage long analyses in persistent tmux sessions.
These tutorials are organized to guide you through the essential steps required to work on remote Linux systems and perform genomics analyses using the command line.
The material is structured into the following sections:
Remotely access Linux computers
In this section, you will learn how to connect to remote computers using the ssh protocol and understand the basic concepts required for secure remote access.
Safely run computer jobs
This section introduces the tmux tool, which allows you to run and manage long computational jobs in persistent terminal sessions.
Bash command-line reference
This section provides a practical overview of commonly used Bash commands for navigating the file system, managing files, and supporting genomics analyses.
File Exchange
This section provides a practical overview of how to exchange files using FileZilla.
Group Exercises 1: Building a Shared Bioinformatics Workflow
This section provides group activities to help you practice remotely accessing Linux computers, safely running jobs, and using Bash commands.
Group Exercises 2: File Exchange Using FileZilla
This section provides group activities to help you learn how to transfer files between your computer and a remote server.
Each group is composed of 5 to 6 students, with a mix of undergraduate and graduate students to promote peer learning and knowledge exchange.
Each group is divided into two subgroups (A and B). Both subgroups will perform the same analyses in parallel, working independently at first.
This structure is designed to:
Each group has been assigned two accounts:
Each subgroup must use its assigned account to complete the group activities.
You will access the Linux systems using the ssh protocol. Details on your account information are available here.
To remotely access your assigned Linux computer accounts using the ssh protocol, you need the following information:
If you want to remotely access the lab computers from a Windows machine, please read the protocol described below before continuing.
Important: If you are accessing the Linux computers from outside the BSU campus network, you must first connect through the BSU VPN.
An Internet Protocol (IP) address is a numerical label assigned to each device connected to a computer network that uses the Internet Protocol for communication.
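An IP address serves two main functions: it identifies a host (or network interface) on the network, and it provides the location of that host for addressing.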
The procedure summarized here shows how to retrieve a computer’s IP address on the Ubuntu operating system (Figure 5.1):
Open the System Settings application from the desktop sidebar.
Select the Network tab located under the Hardware section.
Select the Wired tab.
Figure 5.1: Screenshot of Ubuntu desktop showing how to retrieve the IP address.
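You can also retrieve the IP address from the command line by typing any of the following commands in a Terminal:
# Get the first IP address assigned to this computer
hostname -I | awk '{print $1}'
# Show all network interfaces and their addresses
ip a
# Older alternative (on recent Ubuntu releases, requires the net-tools package)
ifconfig -a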
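Before pasting an address into an ssh command, it can help to check that it at least looks like an IPv4 address. The sketch below is illustrative only: the function name is_ipv4 is hypothetical (it is not part of any standard tool), and it checks the dotted format and the 0-255 range of each group, not whether the computer is reachable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit status 0) if the argument
# looks like an IPv4 address such as 132.178.143.53.
is_ipv4() {
    local ip="$1" part
    # Four groups of 1-3 digits separated by dots
    [[ "$ip" =~ ^[0-9]{1,3}(\.[0-9]{1,3}){3}$ ]] || return 1
    # Each group must be in the range 0-255 (10# forces base 10)
    local IFS='.'
    for part in $ip; do
        (( 10#$part <= 255 )) || return 1
    done
    return 0
}

# Try it on a complete and an incomplete address
for candidate in "132.178.143.53" "132.178.143"; do
    if is_ipv4 "$candidate"; then
        echo "$candidate: valid format"
    else
        echo "$candidate: not an IPv4 address"
    fi
done
```

A format check like this catches copy-and-paste mistakes early, before ssh reports a less obvious connection error.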
ssh Protocol
The Secure Shell (ssh) protocol is a method used for secure remote login from one computer to another (Figure 5.2).
It protects communication using strong encryption and allows users to remotely execute commands on another machine.
Figure 5.2: Overview of ssh protocol.
Once you have gathered the required information (IP address and username), open a Terminal and type:
# General command
ssh USER_ID@IP
# Example
ssh bioinformatics@132.178.143.53
Figure 5.3: Example of an ssh connection from a Terminal.
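If you connect often, it can be convenient to keep the username and IP address in shell variables. The sketch below only builds and prints the command, so it is safe to run anywhere. The helper name ssh_cmd is hypothetical, and the values are the example account used in this tutorial. The -p option of ssh sets the port; 22 is the default.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ssh command for a user, host and port.
ssh_cmd() {
    local user="$1" host="$2" port="${3:-22}"   # port defaults to 22
    printf 'ssh -p %s %s@%s\n' "$port" "$user" "$host"
}

USER_ID="bioinformatics"   # example username from this tutorial
IP="132.178.143.53"        # example IP address from this tutorial

# Print the command so you can check it before running it
ssh_cmd "$USER_ID" "$IP"
```

Once the printed command looks right, you can copy it into the Terminal to open the connection.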
If you are using Windows, you can use PuTTY to establish SSH connections.
Download the software here: http://www.putty.org
When launching PuTTY (Figure 5.4):
Enter user@IP in the Host Name field.
Set the port to 22.
Figure 5.4: PuTTY configuration window.
Some analyses performed in genomics can take several hours or even days to complete. To avoid interrupting these analyses, it is important to run them in a persistent terminal environment.
The tool we will use for this purpose is tmux, which keeps terminal sessions running on the remote computer even after you disconnect, and lets you run several sessions at the same time.
tmux protocol
tmux is a terminal multiplexer for Unix-like systems.
It allows users to create multiple independent terminal sessions within a single window.
Key advantages:
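tmux on Mac
brew install tmux
tmux Sessions
Start a new session:
tmux
Detach from the current session (it keeps running in the background). Press:
Ctrl+b then d
Reattach to the most recent session:
tmux attach
Create a new session named JOB1:
tmux new -s JOB1
List all sessions:
tmux list-sessions
Attach to the session named JOB1:
tmux attach -t JOB1
Kill the session named JOB1:
tmux kill-session -t JOB1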
Warning: Killing a session will terminate any running analysis.
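For a long analysis, you may prefer to start the job in a detached session in one step rather than opening the session and typing the command by hand. The following is a minimal sketch, not a course requirement: the helper name start_job, the example command, and the log file name are all assumptions. It relies on two standard tmux subcommands, has-session and new-session.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command in a new detached tmux session.
# Usage: start_job SESSION_NAME "COMMAND"
start_job() {
    local name="$1" cmd="$2"
    # Refuse to reuse a session name that is already taken
    if tmux has-session -t "$name" 2>/dev/null; then
        echo "Session '$name' already exists" >&2
        return 1
    fi
    # -d: do not attach; -s: session name; the command runs inside the session
    tmux new-session -d -s "$name" "$cmd"
}

# Example (assumed command and log file name):
#   start_job JOB1 "sleep 600 > job1.log 2>&1"
#   tmux attach -t JOB1
```

A session started this way closes when its command exits, so redirecting the output to a log file, as in the commented example, keeps a record you can read afterwards.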
Below is a collection of common Bash commands used to navigate the file system and manage computational analyses.
ls: list files in a directory
cd: change directory
pwd: print working directory
mkdir: create a directory
rm: remove a file
cp: copy files
mv: move or rename files
cat: display file contents
less: view files page by page
head: show the first lines of a file
tail: show the last lines of a file
whoami: display current user
date: display date and time
exit: exit terminal
df -h: disk usage
free: memory usage
ps: list active processes
top: show running processes
htop: interactive process viewer
kill PID: terminate process
ssh user@host: connect to remote host
scp: transfer files between machines
wget: download files
curl: download files
grep: search text patterns
find: locate files
sed: text substitution
tar: archive files
gzip: compress files
By the end of this section, you will be able to:
In bioinformatics, analyses are often conducted on remote computing systems, while data visualization and reporting are performed locally. Efficient and reliable file transfer is therefore essential for:
Developing good file management and transfer practices is critical for ensuring reproducibility and data integrity.
SFTP (SSH File Transfer Protocol, also called Secure File Transfer Protocol) is a network protocol used to securely transfer files between your local computer and a remote server.
Unlike older file transfer methods (such as FTP), SFTP encrypts both the data and the login credentials during transmission. This ensures that your files and passwords are protected from unauthorized access.
SFTP runs over ssh (port 22).
FileZilla is a graphical SFTP client that simplifies file transfer between your computer and a remote server.
When you connect using FileZilla:
It connects to the remote server over SFTP (port 22).
It authenticates with your ssh login.
In this course, you will use SFTP through FileZilla to move data between your local computer and remote Linux systems.
FileZilla is a free, open-source SFTP client available for Windows, macOS, and Linux.
Notes:
To connect using FileZilla, you will need:
Port: 22 (SFTP)
Once connected, the FileZilla interface is divided into several panels:
These exercises reinforce:
Remote access (ssh)
Persistent sessions with tmux
In this 50-minute group activity, you will work in groups of 5–6 students, divided into two subgroups (A and B), to build and manage a shared bioinformatics workspace on a remote Linux system.
Through this activity, you will:
Connect to a remote Linux system using ssh
Use tmux to safely run and manage persistent computational jobs
This activity is designed to simulate how bioinformatics teams collaborate on shared computing infrastructure. You will experience the importance of clear communication, careful file management, and responsible use of shared resources.
By the end of this session, you will be prepared to confidently work on remote systems, manage your own analysis environment, and collaborate effectively on computational genomics projects.
These exercises are designed to simulate real-world bioinformatics collaboration.
You will work in:
⚠️ Important:
Goal: Successfully access your shared system
Connect to your assigned account using ssh. See here for account details. Once you have gathered the required information (IP address and username), open a Terminal and type (see here if you use PuTTY):
# General command
ssh USER_ID@IP
# Example
ssh bio_40@132.178.143.53
# Retrieve your username
whoami
# Retrieve your current path
pwd
Close the ssh connection:
exit
👉 Deliverable: Shared notes describing the login procedure
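One way to capture the login details for your shared notes is to write them to a text file while you are connected. This is only a sketch: the file name login_notes.txt is an assumption, not a course requirement, and the file is created in whatever directory you run the commands from.

```shell
#!/usr/bin/env bash
# Assumed file name for the shared notes
notes="login_notes.txt"

# Append the current user, working directory and date to the notes file
{
    echo "User: $(whoami)"
    echo "Working directory: $(pwd)"
    echo "Date: $(date)"
} >> "$notes"

# Display the notes collected so far
cat "$notes"
```

Because the block appends with >>, each subgroup member can run it in turn and the entries accumulate in the same file.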
By completing these exercises, you will be able to:
In this 25-minute group activity, you will work in groups of 5–6 students, divided into two subgroups (A and B), to learn how to transfer files between your local computer and the remote system, and verify file integrity.
The key takeaways from this activity are as follows:
Each subgroup should:
Create a text file (e.g., groupA_test.txt or groupB_test.txt) with the following content:
This is a test file for file transfer.
Subgroup: A or B
Using FileZilla:
~/Tutorial_genomics
To upload your file to the remote server, follow these steps:
Connect via ssh and run:
cd ~/Tutorial_genomics
ls -lh
cat groupA_test.txt # or groupB_test.txt
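Listing and displaying the file confirms that it arrived, but comparing checksums is a stricter integrity check. The sketch below simulates a transfer with a local copy so that it can run on a single machine; the temporary directory, the name transferred_copy.txt, and the helper file_hash are assumptions made for illustration. In practice, you would run sha256sum on the local file and on the remote file and compare the two hashes. (On macOS, shasum -a 256 is the usual equivalent.)

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so nothing in your home directory is touched
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create the test file, then simulate the transfer with a local copy
printf 'This is a test file for file transfer.\nSubgroup: A\n' > groupA_test.txt
cp groupA_test.txt transferred_copy.txt

# Hypothetical helper: print only the SHA-256 hash of a file
file_hash() {
    sha256sum "$1" | awk '{print $1}'
}

# Identical files always produce identical hashes
if [ "$(file_hash groupA_test.txt)" = "$(file_hash transferred_copy.txt)" ]; then
    echo "OK: checksums match"
else
    echo "WARNING: checksums differ"
fi
```

If the two hashes differ, the file was changed or corrupted somewhere along the way and should be transferred again.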
Edit the file with a text editor (e.g., vim, nano):
vim groupA_test.txt
When would command-line tools (e.g., scp, rsync) be preferable?
Use the protocol you just learned to transfer the ORF map from Module 2 to your local computer.